Abstract

We address the question of optimal transport from one maximizing probability to another. Consider the shift $\sigma$ acting on the Bernoulli space $\Sigma = \{1, 2, \dots, n\}^{\mathbb N}$. We denote $\hat\Sigma = \{1, 2, \dots, n\}^{\mathbb Z} = \Sigma \times \Sigma$. We analyze several properties of the maximizing probability $\mu_{\infty,A}$ of a Hölder potential $A : \Sigma \to \mathbb R$. Associated to $A(x)$, via the involution kernel $W(x,y)$, $W : \hat\Sigma \to \mathbb R$, one can get the dual potential $A^*(y)$, where $(x,y) \in \hat\Sigma$. We assume that the maximizing probability $\mu_{\infty,A}$ is unique. Consider $\mu_{\infty,A^*}$, a maximizing probability for $A^*$. We also analyze the same problem for expanding transformations on the circle.

We would like to consider the transport problem from $\mu_{\infty,A}$ to $\mu_{\infty,A^*}$. In this case, it is natural to consider the cost function $c(x,y) = I(x) - W(x,y) + \gamma$, where $I$ is the deviation function for $\mu_{\infty,A}$, seen as the limit of the Gibbs probabilities $\mu_{\beta A}$ for the potential $\beta A$ when $\beta \to \infty$. The value $\gamma$ is a constant which depends on $A$. We could also take $c = -W$ above. We denote by $\mathcal K = \mathcal K(\mu_{\infty,A}, \mu_{\infty,A^*})$ the set of probabilities $\hat\eta(x,y)$ on $\hat\Sigma$ such that $\pi_x^*(\hat\eta) = \mu_{\infty,A}$ and $\pi_y^*(\hat\eta) = \mu_{\infty,A^*}$.

We describe the minimal solution $\hat\mu$ (which is invariant by the shift on $\hat\Sigma$) of the Transport Problem, that is, the solution of

$$\inf_{\hat\eta \in \mathcal K} \iint c(x,y)\, d\hat\eta = - \max_{\hat\eta \in \mathcal K} \iint (W(x,y) - \gamma)\, d\hat\eta.$$

The optimal pair of functions for the Kantorovich Transport dual Problem is $(-V, -V^*)$, where we denote the two calibrated sub-actions by $V$ and $V^*$, respectively, for $A$ and $A^*$. For a certain class of potentials $A$ we show that the involution kernel $W$ satisfies a twist condition and, finally, we analyze, in this case, whether the support of $\hat\mu$ is a graph. We also analyze the question of finding an explicit expression for the function $f : \Sigma \to \mathbb R$ whose $c$-subderivative determines the graph.
1 Introduction

It seems natural to investigate the connections of Transport Theory with Ergodic Theory. Some results in this direction appear in [K1] and [BOR]. Here we follow a different path.

Given a continuous function $A : \Sigma = \{1, 2, 3, \dots, d\}^{\mathbb N} \to \mathbb R$, we call $\mu_{\infty,A}$ a maximizing probability for $A$ if $\int A\, d\nu$ attains its maximal value at $\nu = \mu_{\infty,A}$, when the probabilities $\nu$ range over the set of probabilities invariant for the shift $\sigma$ acting on the Bernoulli space $\Sigma$. We denote by $m(A)$ this maximal value.
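As a toy illustration of the definition of $m(A)$ (not a method taken from this paper), assume the potential depends only on the first two coordinates, $A(x) = A(x_0, x_1)$. Under that assumption the maximal value is attained on a periodic orbit of period at most $d$, so a brute-force search over short periodic words is enough. The function name `max_cycle_mean`, the 0-based symbols and the sample matrix are hypothetical choices made for this sketch.

```python
from itertools import product

def max_cycle_mean(A, max_period=None):
    """Brute-force m(A) for a potential A(x) = A[x0][x1] on d symbols.

    Assumption of this sketch: A depends on two coordinates only, so the
    maximum of the ergodic averages is reached on a cycle of length <= d.
    """
    d = len(A)
    if max_period is None:
        max_period = d  # simple cycles have length at most d
    best, best_word = float("-inf"), None
    for n in range(1, max_period + 1):
        for word in product(range(d), repeat=n):
            # Birkhoff average of A along the periodic point (word word word ...)
            total = sum(A[word[i]][word[(i + 1) % n]] for i in range(n))
            mean = total / n
            if mean > best:
                best, best_word = mean, word
    return best, best_word

# Hypothetical example: the period-2 orbit (0 1 0 1 ...) is maximizing.
A = [[0.0, 2.0],
     [1.0, 0.0]]
m, word = max_cycle_mean(A)
```

For this sample matrix the two fixed points give average 0, while the orbit of period 2 gives $(2 + 1)/2$.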

Such maximizing probabilities $\mu_{\infty,A}$ can be seen as the equilibrium states at zero temperature [CG] [CLT] [Le] [Jenkinson1] [Bousch1] [Mo] for a system on the one dimensional lattice $\mathbb N$ with $d$ spins at each site and under the influence of an interacting potential $A$.

A main conjecture in the area claims that for a generic Hölder potential $A$ the maximizing probability has support in a unique periodic orbit for the shift. For a partial result see [CLT].

We address the question of finding the optimal transport plan from a certain maximizing probability to another. More precisely, we would like to consider the transport problem from $\mu_{\infty,A}$ to $\mu_{\infty,A^*}$, where $A : \Sigma = \{1, 2, 3, \dots, d\}^{\mathbb N} \to \mathbb R$ is a Hölder potential and $A^*$ is its dual (see [BLT]). We consider here that $A$ acts on the variable $x$ and $A^*$ on the variable $y$. We will describe below in full detail the setting we are going to consider in the present paper. We will also provide several examples to illustrate the theory.

We assume here in most (but not all) of the results that the maximizing probability $\mu_{\infty,A}$ (on $\Sigma$) for $A$ is unique.

We denote by $\hat\mu$ the minimizing probability over

$$\hat\Sigma = \{1, 2, 3, \dots, d\}^{\mathbb Z} = \Sigma \times \Sigma,$$

for the natural Kantorovich Transport Problem associated to $-W$, where $W(x,y)$, for $(x,y) \in \Sigma \times \Sigma$, is the involution kernel associated to $A$ (see [BLT]).

We will denote by $\hat\sigma$ the shift on $\hat\Sigma$.

We point out that by its very nature the Classical Transport Theory is not a Dynamical Theory (in the sense of considering invariant probabilities) [Vi1] [Vi2] [Ra]. One has to consider a cost which is obtained from dynamical properties in order to get optimal plans which are invariant for $\hat\sigma$.

The probability $\hat\mu_{max}$, the natural extension of $\mu_{\infty,A}$, is described in [BLT].

First we show that:

Theorem 1. The minimizing Kantorovich probability $\hat\mu$ on $\hat\Sigma$ associated to $-W$, where $W$ is the involution kernel for $A$, is $\hat\mu_{max}$.
The calibrated subactions $V$ play an important role in Ergodic Optimization. They can help to find the support of the maximizing probability (see [Jenkinson1] or [CLT] for instance). Moreover, if we denote $R(x) = V(\sigma(x)) - V(x) - A(x) + m(A)$, then $I(x) = \sum_{n \ge 0} R(\sigma^n(x))$ defines a nonnegative lower semicontinuous function (it can be infinite at several points) which is the deviation function for the family of Gibbs states associated to $A$ when the temperature converges to zero [BLT] (see [BCLMS] [LMST] for the case of the XY model). For a class of explicit nontrivial examples of subactions $V$ see [BLM].

Theorem 2. If $V$ is the calibrated subaction for $A$, and $V^*$ is the calibrated subaction for $A^*$, then the pair $(-V, -V^*)$ is the dual $(-W + I)$-Kantorovich pair of $(\mu_{\infty,A}, \mu_{\infty,A^*})$, where $I$ is the deviation function for $A$.

Finding the optimal transport measure between two probabilities is the solution of the so called relaxed problem [Vi1]. If we want to find a measurable transformation (the Monge problem) which transfers one probability to another, we need to show that the graph property is true on the support of such probability (which does not always happen if one considers a general cost function) [Vi1].

Finally, we analyze here the graph property for the support of $\hat\mu_{max}$ (over $\hat\Sigma = \{1, 2, 3, \dots, d\}^{\mathbb Z}$), which is the minimizing probability for the cost function $-W$.

One can consider on the Bernoulli space $\Sigma = \{0, 1\}^{\mathbb N}$ the lexicographic order. In this way, $x < z$ if and only if the first index $i$ such that $x_j = z_j$ for all $j < i$ and $x_i \neq z_i$ satisfies $x_i < z_i$. Moreover,

$$(0, x_1, x_2, \dots) < (1, x_1, x_2, \dots).$$

One can also consider the more general case $\Sigma = \{0, 1, \dots, d-1\}^{\mathbb N}$, but in order to simplify the notation and to avoid technicalities, we consider only the case $\Sigma = \{0, 1\}^{\mathbb N}$.
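The lexicographic comparison above can be sketched on finite prefixes of sequences. This is only an illustration: the function name `lex_less` is hypothetical, and a finite prefix can certify $x < z$ only when the two prefixes actually differ.

```python
def lex_less(x, z):
    """Return True if x < z in the lexicographic order, judged on finite prefixes.

    x and z are tuples of symbols (prefixes of points of the Bernoulli space).
    The first index where they differ decides the comparison.
    """
    for xi, zi in zip(x, z):
        if xi != zi:
            return xi < zi
    # The prefixes agree on their common length: strict inequality is not certified.
    return False

first = lex_less((0, 1, 1), (1, 1, 1))   # differ at index 0, and 0 < 1
second = lex_less((0, 1, 0), (0, 0, 1))  # differ at index 1, and 1 > 0
```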



Definition 1. We say a continuous $G : \hat\Sigma = \Sigma \times \Sigma \to \mathbb R$ satisfies the twist condition on $\hat\Sigma$, if for any $(a, b) \in \hat\Sigma = \Sigma \times \Sigma$ and $(a', b') \in \Sigma \times \Sigma$, with $a' > a$, $b' > b$, we have

$$G(a, b) + G(a', b') < G(a, b') + G(a', b). \qquad (1)$$

The twist condition is inspired by Aubry-Mather Theory [Ban] [CI] [Go] [GT1] [GT2]. It is quite a natural concept in Classical Optimization and Transport Theory [HI] [Ei] [Vi1] [Vi2] [E] [CLO] [LHS] (see [LO] for dynamical examples).

We point out that in Mather Theory, in order to have the graph property (see [Mat] [CI]) for the minimal action measure, it is necessary to assume that the Lagrangian is convex in the velocity. We need in our setting some technical assumptions to replace this important property. We believe that the twist condition is the natural one.
Definition 2. We say a continuous $A : \Sigma \to \mathbb R$ satisfies the twist condition if its involution kernel $W$ satisfies the twist condition.

The involution kernel of $A$ is not unique (see [BLT]), but if the above property is true for some $W$, then it will also be true for any other one. Our final result is:

Theorem 3. Suppose $W$ satisfies the twist condition on $\hat\Sigma$. Then the support of $\hat\mu_{max} = \hat\mu$ on $\hat\Sigma$ is a graph.

We point out that there can exist (not always) a unique point in the support of $\hat\mu$ such that its orbit has two points in the support on the same vertical fiber. But this orbit is a set of zero measure.

A similar definition can be considered for an expanding transformation on $[0, 1]$, and we are also able to get the analogous graph property result. This also includes the case of $T(x) = -2x \pmod 1$.

We present in the appendix at the end of the paper several examples (and computations) where one can write the involution kernel $W$ explicitly and the twist condition is satisfied.

First we will explain all the preliminaries we will need later.

Consider $X$ a compact metric space. Given a continuous transformation $f : X \to X$, we denote by $\mathcal M_f$ the convex set of $f$-invariant Borel probability measures. As usual, we consider in $\mathcal M_f$ the weak* topology.

The standard model used in ergodic optimization is the triple $(X, f, \mathcal M_f)$.
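Given a potential $A \in C^0(X)$, we denote

$$\beta_A = \max_{\nu \in \mathcal M_f} \int A\, d\nu. \qquad (2)$$

We are interested here in the characterization and main properties of $A$-maximizing probabilities, that is, the probabilities belonging to the set

$$\Big\{ \nu \in \mathcal M_f : \int A\, d\nu = \beta_A \Big\}. \qquad (3)$$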

We will assume here that $A$ is Hölder.

In the following we will also assume that the maximizing probability $\mu_{\infty,A} = \mu_\infty$ is unique.

Under reasonable hypotheses (expanding, hyperbolic, etc.) several results were obtained related to this maximizing question, among them [CG] [BLT] [Bousch1] [Bousch2] [CLT] [HY] [Jenkinson1] [Mo] [Le] [Jenkinson2] [LT1] [TZ] [Sa] [BG] [GT1] [GT2]. For maximization with constraints see [GL2] [LT2]. Questions related to the dynamics on the boundary of the fat attractor appear in [LO]. Naturally, if we replace the maximizing notion by the minimizing one, the analogous properties will also be true.
Our focus here will be mainly on symbolic dynamics and on expanding transformations on $S^1$ or the interval $[0, 1]$. We recall some basic definitions (see [CLT] for example).

So let $\sigma : \Sigma \to \Sigma$ be a subshift of finite type defined by a matrix $C$ of $0$ and $1$, where $\sigma(x_0, x_1, x_2, \dots) = (x_1, x_2, x_3, \dots)$. In this case we are considering $X = \Sigma \subset \{1, 2, 3, \dots, d\}^{\mathbb N}$ and $f = \sigma$. Recall that, for a fixed $\lambda \in (0, 1)$, we consider on $\Sigma$ the metric $d(x, \bar x) = \lambda^k$, where $x = (x_0, x_1, \dots)$, $\bar x = (\bar x_0, \bar x_1, \dots) \in \Sigma$ and $k = \min\{j : x_j \neq \bar x_j\}$. In this situation, given a Hölder potential $A : \{1, 2, 3, \dots, d\}^{\mathbb N} \to \mathbb R$, one is interested in $A$-maximizing probabilities for the triple $(\Sigma, \sigma, \mathcal M_\sigma)$, where the probabilities are considered over $\mathcal B$, the Borel $\sigma$-algebra of $\Sigma$. In order to simplify the notation here we will consider the full Bernoulli space (all entries of $C$ are equal to 1).

Given a $C^{1+\alpha}$ expanding transformation $T$ of fixed degree on $S^1$ and $A : S^1 \to \mathbb R$, we will be interested in $A$-maximizing probabilities on $(S^1, T, \mathcal M_T)$, where the probabilities are considered over $\mathcal B$, the Borel $\sigma$-algebra of $S^1$.

One can consider the analogous setting for $C^{1+\alpha}$ expanding transformations of fixed degree over $[0, 1]$.

Convex potentials $A : [0, 1] \to \mathbb R$ and the transformation $T : [0, 1] \to [0, 1]$, given by $T(x) = 2x \pmod 1$, were considered in [Jenkinson3], where it was shown that the maximizing probabilities in this case are Sturm measures. For $T(x)$ equal to $-2x \pmod 1$, however, the situation is completely different (see [JS]).

Definition 3. A function $u \in C^0(\Sigma)$ is a sub-action for the potential $A$ if, for any $x \in \Sigma = \{1, 2, 3, \dots, d\}^{\mathbb N}$, we have

$$u(x) \le u(\sigma(x)) - A(x) + \beta_A. \qquad (4)$$

Let $(\Sigma^*, \sigma^*)$ be the dual subshift.

In the case of the full Bernoulli space (all entries of $C$ equal to 1) we have $\Sigma^* = \{1, 2, 3, \dots, d\}^{\mathbb N}$ and $\sigma^*(y_0, y_1, y_2, \dots) = (y_1, y_2, \dots)$.

We consider the space of the dynamics $(\hat\Sigma, \hat\sigma)$, the natural extension of $(\Sigma, \sigma)$, as a subset of $\Sigma^* \times \Sigma$. In fact, if $y = (\dots, y_1, y_0) \in \Sigma^*$ and $x = (x_0, x_1, \dots) \in \Sigma$, then $\hat\Sigma$ will be the set of points

$$\langle y, x \rangle = (\dots, y_1, y_0 \,|\, x_0, x_1, \dots) \in \Sigma^* \times \Sigma,$$

such that $(y_0, x_0)$ is an allowed word (no restrictions when we consider the full Bernoulli space). In this case

$$\hat\sigma(\dots, y_1, y_0 \,|\, x_0, x_1, \dots) = (\dots, y_1, y_0, x_0 \,|\, x_1, x_2, \dots).$$

We point out that we use here the notation $\langle y, x \rangle = (x, y)$. For functions $b : \hat\Sigma \to \mathbb R$, we denote its value on $\langle y, x \rangle$ by $b(x, y)$.

We define the map $\tau : \hat\Sigma \to \Sigma$ by $\tau(x, y) = \tau_y(x) = (y_0, x_0, x_1, \dots)$.
Note that, if $\pi_x : \hat\Sigma \to \Sigma$ is the projection on the $x$ coordinate, then $\tau_y(x) = \pi_x \circ \hat\sigma^{-1}(x, y)$.

We denote by $\pi_y(x, y) = y$ the projection on the second coordinate. Note that $\hat\sigma^{-1}(x, y) = (\tau_y(x), \sigma^*(y))$.

Definition 4. A continuous function $V : \Sigma \to \mathbb R$ is called a calibrated subaction for $A$ if

$$V(x) = \max_{z \,:\, \sigma(z) = x} \big( V(z) + A(z) - m(A) \big).$$

(In other terms, $V$ is a calibrated subaction if it is a subaction and, for any $x \in \Sigma$, there exists $z \in \Sigma$ such that $\sigma(z) = x$ and $V(z) + A(z) - m(A) = V(x)$.)

Note that for all $z$ we have

$$V(\sigma(z)) - V(z) - A(z) + m(A) \ge 0.$$
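To make Definition 4 concrete, here is a small check in a simplified setting that is an assumption of this sketch, not the general case treated in the paper: the potential depends on two coordinates, $A(z) = A(z_0, z_1)$, and the candidate $V$ depends only on the first coordinate. The preimages of $x$ are then $z = (i, x_0, x_1, \dots)$, and the calibration equation becomes $V(j) = \max_i \big(V(i) + A(i, j) - m(A)\big)$. The names `calibration_defect` and `is_calibrated`, and the sample data, are hypothetical.

```python
def calibration_defect(A, V, m):
    """For each symbol j return max_i (V[i] + A[i][j] - m) - V[j].

    Assumption of this sketch: A depends on two coordinates and V on one,
    so the preimages z of x are indexed by their first symbol i.
    A defect equal to 0 for every j means V is calibrated.
    """
    d = len(A)
    return [max(V[i] + A[i][j] - m for i in range(d)) - V[j] for j in range(d)]

def is_calibrated(A, V, m, tol=1e-12):
    return all(abs(e) <= tol for e in calibration_defect(A, V, m))

# Hypothetical data: m = 1.5 is the maximal ergodic average of this A.
A = [[0.0, 2.0],
     [1.0, 0.0]]
m = 1.5
good = is_calibrated(A, [0.0, 0.5], m)
bad = is_calibrated(A, [0.0, 0.0], m)
```

With these numbers the candidate $V = (0, 0.5)$ satisfies the equation at both symbols, while the constant candidate does not.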

We show below some explicit expressions for calibrated subactions for a class of potentials $A$.

We point out that we will also consider here analogous results for an expanding transformation $T : S^1 \to S^1$ (or $T : [0, 1] \to [0, 1]$) of class $C^{1+\alpha}$, and a Hölder potential $A : S^1 \to \mathbb R$ (or $A : [0, 1] \to \mathbb R$) as in [BLT]. The case $T(x) = -2x \pmod 1$ is one of the examples we have in mind.

In this case one could consider analogous problems in $S^1 \times S^1$, or in $S^1 \times \Sigma$, if one considers the symbols $i$ which index the inverse branches $\tau_i$ of $T$ [LOS] [LO]. The existence of the involution kernel, L.D.P. properties, etc., are also true.

The calibrated sub-action is unique (up to an additive constant) if the maximizing probability is unique (see [CLT] [BLT] [GL1]).

We point out that we called strict in [BLT] what we denote here by calibrated.

We will use from now on the notation of [BLT].

Definition 5. Given $A : \Sigma \to \mathbb R$ Lipschitz, consider $A^*(y)$ (the dual potential), where $A^* : \Sigma^* \to \mathbb R$, and $W(x, y) = W_A(x, y)$ its involution kernel. This means, by definition, that for all $\langle y, x \rangle = (x, y) \in \hat\Sigma$,

$$A^*(y) = A(\tau_y(x)) + W(\tau_y(x), \sigma^*(y)) - W(x, y). \qquad (5)$$

This expression can also be written in the form

$$A^*(x, y) = A(\hat\sigma^{-1}(x, y)) + W(\hat\sigma^{-1}(x, y)) - W(x, y).$$

If $A$ depends on just two coordinates we can take $A^*$ as the transpose of $A$. Therefore, the above definition extends this concept to the case where $A$ depends on infinitely many coordinates on the Bernoulli space. We say $A$ is involutive if $A = A^*$.
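The remark about the transpose can be checked directly on a finite example. The sketch below assumes $A(x) = A(x_0, x_1)$ and uses the candidate kernel $W(x, y) = A(y_0, x_0)$; this particular $W$ is an assumption made here for illustration (the involution kernel is not unique). Since $\tau_y(x)$ begins with $(y_0, x_0)$ and $\sigma^*(y)$ begins with $y_1$, the right-hand side of (5) reduces to $A(y_1, y_0)$, which is the transpose of $A$ evaluated at $(y_0, y_1)$.

```python
import random

random.seed(0)
d = 3
# Integer entries keep the arithmetic exact (hypothetical sample potential).
A = [[random.randint(-5, 5) for _ in range(d)] for _ in range(d)]

def W(x0, y0):
    # Candidate kernel, an assumption of this sketch: W(x, y) = A(y0, x0).
    return A[y0][x0]

def rhs_of_5(x0, y0, y1):
    # A(tau_y(x)) + W(tau_y(x), sigma*(y)) - W(x, y) for two-coordinate data:
    # tau_y(x) = (y0, x0, ...), and sigma*(y) has first coordinate y1.
    return A[y0][x0] + W(y0, y1) - W(x0, y0)

# The right-hand side must not depend on x0 and must equal the transpose of A.
ok = all(
    rhs_of_5(x0, y0, y1) == A[y1][y0]
    for x0 in range(d) for y0 in range(d) for y1 in range(d)
)
```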
We suppose that $c$ is a normalization constant for $W$ in the sense that

$$\iint e^{W(x, y) - c}\, d\nu_{A^*}(y)\, d\nu_A(x) = 1, \qquad (6)$$

where $\nu_A$ and $\nu_{A^*}$ are respectively the eigen-probabilities for the Ruelle operators of $A$ and $A^*$ [CLT].

We also denote by $\varphi_A$ and $\varphi_{A^*}$ the corresponding eigen-functions. Finally, $\mu_A = \varphi_A\, \nu_A$ and $\mu_{A^*} = \varphi_{A^*}\, \nu_{A^*}$ are the invariant probabilities which are the solutions of the respective pressure problems for $A$ and $A^*$.

For a fixed $A$ we consider a real parameter $\beta$, the corresponding potentials $\beta A$, the eigenfunctions $\varphi_{\beta A}$, and so on.

In Statistical Mechanics $\beta$ is the inverse of the temperature. In this way asymptotic results when $\beta \to \infty$ can be considered as the ones which describe the system in equilibrium at temperature zero.

Note that $\beta W$ is an involution kernel for $\beta A$, and its dual is $\beta A^*$.

It is known (see for instance [CLT]) that a sub-action $V$ can be obtained as the limit

$$V(x) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \varphi_{\beta A}(x). \qquad (7)$$

This $V$ is a calibrated sub-action for $A$ (see [CLT] [BLT]).

We can also get a calibrated sub-action $V^*$ for $A^*$ using the limit

$$V^*(y) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \varphi_{\beta A^*}(y). \qquad (8)$$

From [BLT] we have

$$\varphi_{A^*}(y) = \int e^{W(x, y) - c}\, d\nu_A(x).$$
Finally, we define for each $x \in \Sigma$,

$$I(x) = \sum_{n=0}^{\infty} \big[ V \circ \sigma - V - (A - m(A)) \big]\, \sigma^n(x),$$

where $V$ is a (any) calibrated sub-action.
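A truncated version of the series defining $I$ can be evaluated on finite prefixes. As before, the two-coordinate potential, the one-coordinate calibrated sub-action and the sample numbers are assumptions of this sketch, and `deviation_partial_sum` is a hypothetical name. Each summand is nonnegative, it vanishes along the maximizing periodic orbit, and every transition that leaves that orbit adds a positive amount.

```python
def deviation_partial_sum(A, V, m, x, n_terms):
    """Sum of [V(sigma z) - V(z) - (A(z) - m)] over z = sigma^n(x), n < n_terms.

    x is a finite prefix (list of symbols); A depends on two coordinates and
    V on the first coordinate only (assumptions of this sketch).
    """
    total = 0.0
    for n in range(n_terms):
        a, b = x[n], x[n + 1]
        total += V[b] - V[a] - (A[a][b] - m)
    return total

A = [[0.0, 2.0],
     [1.0, 0.0]]
m = 1.5
V = [0.0, 0.5]  # calibrated for this A (hypothetical example data)

on_orbit = [0, 1] * 10          # prefix of the maximizing periodic point
off_orbit = [0] + [0, 1] * 10   # one extra symbol before joining the orbit

value_on = deviation_partial_sum(A, V, m, on_orbit, len(on_orbit) - 1)
value_off = deviation_partial_sum(A, V, m, off_orbit, len(off_orbit) - 1)
```

Here the single transition $0 \to 0$ in `off_orbit` contributes $V(0) - V(0) - (A(0,0) - m) = 1.5$ and all other terms vanish.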

The function $I$, where $I : \Sigma \to \mathbb R \cup \{\infty\}$, can have infinite values, but it is lower semi-continuous.

In [BLT] it is shown that for any cylinder set $C \subset \Sigma$,

$$\lim_{\beta \to +\infty} \frac{1}{\beta} \log \mu_{\beta A}(C) = - \inf_{x \in C} I(x).$$

In this way we get a Large Deviation principle for $\mu_{\beta A} \to \mu_\infty$. Remember that we denote by $\mu^*_\infty$ the unique maximizing probability for $A^*$ (it is unique because $\mu_\infty$ is unique for $A$, and, moreover, $A$ and $A^*$ are cohomologous in $\hat\Sigma$).
All the results described above are true for expanding transformations $T$ of class $C^{1+\alpha}$ on the circle $S^1$. In this case we have to consider the natural extension $\hat T$ of $T$. This also includes the case of $T(x) = -2x \pmod 1$.

In the case $T : S^1 \to S^1$, given by $T(x) = 2x \pmod 1$, we define $\hat T$ in the following way: the Baker transformation associated to $T$, denoted by $\hat T(x_1, x_2)$, where $\hat T : [0, 1]^2 \to [0, 1]^2$, satisfies for all $(x_1, x_2) \in [0, 1]^2$ the relation $\hat T(x_1, T^*(x_2)) = (T(x_1), x_2)$ (see picture below). In this case $T^* : S^1 \to S^1$, with $T^*(y) = 2y \pmod 1$, $T$ plays the role of $\sigma$, and $T^*$ plays the role of $\sigma^*$, in the definitions and results above.

All the above applies to an expanding transformation $T : S^1 \to S^1$, or $T : [0, 1] \to [0, 1]$.

The transformation $\hat T$ on $S^1 \times S^1$ contracts vertical fibers by forward iteration and expands (and cuts) vertical fibers by backward iteration.




Characterization of S 



Remember that we said that $W : \hat\Sigma = \Sigma \times \Sigma \to \mathbb R$ satisfies the twist condition on $\hat\Sigma$, if for any $(a, b) \in \hat\Sigma = \Sigma \times \Sigma$ and $(a', b') \in \Sigma \times \Sigma$, with $a' > a$, $b' > b$, we have

$$W(a, b) + W(a', b') < W(a, b') + W(a', b). \qquad (9)$$

We have the analogous definition for expanding transformations on the interval:

Definition 6. We say $W : [0, 1]^2 \to \mathbb R$ continuous satisfies the twist condition on $[0, 1]^2$, if for any $(a, b) \in [0, 1]^2$ and $(a', b') \in [0, 1]^2$, with $a' > a$, $b' > b$, we have

$$W(a, b) + W(a', b') < W(a, b') + W(a', b). \qquad (10)$$

The same definition applies to $W$ on $S^1 \times S^1$.

When $x, y \in [0, 1]$ (or on $S^1$), the condition

$$\frac{\partial^2 W(x, y)}{\partial x\, \partial y} < 0$$

implies the twist condition for $W$.
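The twist inequality of Definition 6 can be tested numerically on a finite grid of $[0, 1]^2$. A grid test is only a sanity check and not a proof; the helper name `is_twist_on_grid` and the grid size are hypothetical choices. The three sample functions are $-2\log(1 + xy)$ and $\frac{1}{3}(x^2 + y^2) - \frac{4}{3}xy$, whose mixed second derivatives are negative on $[0,1]^2$, and $xy$, whose mixed second derivative is positive.

```python
import math

def is_twist_on_grid(W, grid):
    """Check W(a,b) + W(a2,b2) < W(a,b2) + W(a2,b) for all grid points a < a2, b < b2."""
    for a in grid:
        for a2 in grid:
            if a2 <= a:
                continue
            for b in grid:
                for b2 in grid:
                    if b2 <= b:
                        continue
                    if not (W(a, b) + W(a2, b2) < W(a, b2) + W(a2, b)):
                        return False
    return True

grid = [i / 10 for i in range(11)]

log_kernel = lambda x, y: -2.0 * math.log(1.0 + x * y)
quadratic_kernel = lambda x, y: (x * x + y * y) / 3.0 - 4.0 * x * y / 3.0
product_kernel = lambda x, y: x * y  # positive mixed derivative: not twist

twist_log = is_twist_on_grid(log_kernel, grid)
twist_quadratic = is_twist_on_grid(quadratic_kernel, grid)
twist_product = is_twist_on_grid(product_kernel, grid)
```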
Example 1. Consider the transformation $T : S^1 \to S^1$, given by $T(x) = -2x \pmod 1$, and $A(x) = a + bx + cx^2$, where $a, b, c$ are constants and $c > 0$. In item b) in the appendix we show an explicit expression for the $W$-kernel and we prove that $W$ satisfies the twist condition. From this, we can get an explicit expression for the calibrated subaction for a certain potential (see Remark 6 in the appendix).

We point out that for considering the system above on $S^1$ we have to assume that $A(0) = A(1)$. If we are interested in the case of $[0, 1]$ the same result can be obtained, but we do not have to assume $A(0) = A(1)$.

Moreover, we also show in item c) in the appendix that a certain class of analytic perturbations of $A(x) = a + bx + cx^2$ produces $W$-kernels which are twist.

Example 2. In item d) in the appendix we show an example of a $W$-kernel for a continuous potential $A$, and for the action of the shift $\sigma$ on the Bernoulli space $\{0, 1\}^{\mathbb N}$, which is twist.

Example 3. Consider the Gauss map $T(x) = \frac{1}{x} - \left[\frac{1}{x}\right]$ on $[0, 1]$.

We can define the Baker transformation associated to $T$, denoted by $\hat T(x_1, x_2)$, where $\hat T : [0, 1]^2 \to [0, 1]^2$.

The kernel for $A(x_1) = -\log T'(x_1)$ is $W(x_1, x_2) = -2 \log(1 + x_1 x_2)$ (see [BLT]).

It is known that the dual of $A = -\log T'$ is $A^* = -\log T'$ (see Proposition 4 in [BLT]).

The maximizing probability for such potential $-\log T'(x) = 2 \log(x)$ is the $\delta$-Dirac at the fixed point $b$, where $b$ is the golden mean $b = \frac{\sqrt 5 - 1}{2}$ (see for instance [CG]). In this case $m(A) = 2 \log(b)$.

Note that $W$ is differentiable at any point $(x_1, x_2) \in [0, 1]^2$.

One can easily see that an explicit calibrated sub-action $u$ (unique up to an additive constant because the maximizing probability is unique [GL1]) satisfying

$$u(x) \le u(T(x)) - A(x) + m(A), \qquad (11)$$

is $u(x) = W(x, b) = -2 \log(1 + x b)$.

Note that

$$\frac{\partial^2 W(x, y)}{\partial x\, \partial y} < 0,$$

and, therefore, $W$ is twist.
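The sub-action inequality (11) for the Gauss map can be checked numerically on sample points. This is a sanity check under floating-point arithmetic, not a proof; the grid of sample points and the helper names are hypothetical. A short computation with $n = [1/x]$ and $b^2 + b - 1 = 0$ shows that the gap $u(T(x)) - A(x) + m(A) - u(x)$ equals $2 \log\big(b(1 + xb) / (x + b - n b x)\big)$, which is nonnegative and vanishes when $n = 1$.

```python
import math

b = (math.sqrt(5.0) - 1.0) / 2.0   # fixed point of the Gauss map in (0, 1)
m = 2.0 * math.log(b)              # m(A) = A(b) for A(x) = 2 log x

def T(x):
    return 1.0 / x - math.floor(1.0 / x)

def A(x):
    return 2.0 * math.log(x)

def u(x):
    return -2.0 * math.log(1.0 + x * b)

def gap(x):
    # Nonnegative exactly when the sub-action inequality (11) holds at x.
    return u(T(x)) - A(x) + m - u(x)

xs = [k / 1000 for k in range(1, 1001)]   # sample points in (0, 1]
min_gap = min(gap(x) for x in xs)
gap_at_fixed_point = gap(b)
```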

Example 4. Suppose $T(x)$ is $-2x \pmod 1$, $T : [0, 1] \to [0, 1]$, and $A : [0, 1] \to \mathbb R$ is Hölder and monotone. Under some assumptions on $A$ one can get cases where the maximizing probability is unique and has support on the right fixed point $p$ (see [JS]). In the same way as in the last example one can show that $V(x) = W(x, p)$ is a calibrated subaction.

If one considers on the interval $[0, 1]$ the potential $A(x) = x^2$, then we are under such assumptions. One can show that $A^*(y) = y^2$, and $W(x, y) = (1/3)(x^2 + y^2) - (4/3)xy$ (see Remark 5 in item b) in the appendix). In the same way $\frac{\partial^2 W}{\partial x\, \partial y} < 0$.

Example 5. Consider the transformation $T : S^1 \to S^1$, given by $T(x) = -2x \pmod 1$ and A(x) = -(x - (a continuous potential on $S^1$) for which all results in [BLT] apply (see also [LT1] where the graph property is shown in this case).

The maximizing probability has support in the periodic orbit of period 2 [Jenkinson3] [JS].

One can define the continuous Baker transformation associated to $T$, denoted by $\hat T(x_1, x_2)$, where $\hat T : [0, 1]^2 \to [0, 1]^2$ satisfies for all $(x_1, x_2) \in [0, 1]^2$ the relation $\hat T(x_1, T(x_2)) = (T(x_1), x_2)$.

In this case, we show in Remark 6 in the appendix that a smooth $W$-kernel is:

$$W(x, y) = -(1/3)x^2 - (1/3)y^2 + (4/3)xy - (2/3)x - (1/3)y.$$

The dual potential $A^*$ is equal to $A$.

This $W$-kernel is not twist because $\frac{\partial^2 W}{\partial x\, \partial y} > 0$.

It follows from a general result presented in [JS] that any maximizing measure for this potential is $\mu_\infty = (1 - t)\,\delta_{1/3} + t\,\delta_{2/3}$, where $t \in [0, 1]$, so the critical value is $m = A(1/3) = A(2/3)$.

It is easy to verify that

$$V(x) = \big(W(x, 1/3) - W(1/3, 1/3)\big)\, \chi_{[0, 1/2)}(x) + \big(W(x, 2/3) - W(2/3, 2/3)\big)\, \chi_{[1/2, 1]}(x)$$

$$= \max\{ W(x, 1/3) - W(1/3, 1/3),\; W(x, 2/3) - W(2/3, 2/3) \}$$

is a calibrated subaction for $A$.




[Figure: three curves. $W(x, 1/3) - W(1/3, 1/3)$ in red, $W(x, 2/3) - W(2/3, 2/3)$ in blue and $0$ in black. The calibrated subaction is the supremum of the two functions described in the picture.]

This calibrated subaction is not analytic but piecewise analytic (see [LOS] for more general results).
Example 6. Consider the transformation $T : S^1 \to S^1$, given by $T(x) = -2x \pmod 1$ and A(x) = (x - (a continuous potential on $S^1$) for which all results in [BLT] apply.

In this case we show in item b) in the appendix that a smooth $W$-kernel is:

$$W(x, y) = (1/3)x^2 + (1/3)y^2 - (4/3)xy + (2/3)x + (1/3)y.$$

The dual potential $A^*$ is equal to $A$. This involution kernel $W$ is twist.

Similar results can be obtained for $T : S^1 \to S^1$, given by $T(x) = 2x \pmod 1$ and A(x) = -(x - (a continuous potential on $S^1$).

Definition 7. Given $G : \hat\Sigma \to \mathbb R$ upper semi-continuous, and $f(x)$ continuous, where $f : \Sigma \to \mathbb R$, we define the $G$-transform of $f$, denoted by $f^\#(y)$, where $f^\# : \Sigma^* \to \mathbb R$, as the function such that

$$f^\#(y) = \max_{x \in \Sigma} \{ -f(x) + G(x, y) \}. \qquad (12)$$

We can also use the notation $f^\#_G$ instead of $f^\#$ if we want to stress the dependence on $G$.

In this case we say that $f^\#$ is the $G$-conjugate of $f$ [Vi1] [Vi2]. We use the notation of [R] page 268.

Note that, if we add a constant to $f$, then the new $f^\#$ will be obtained from the old one by subtracting the same constant. Therefore, in this case the sum $f(x) + f^\#(y)$ will be the same.
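On finite sets the $G$-transform (12) is a plain maximum, which makes the remark about additive constants easy to test. The sketch below is a finite analogue only: the grids, the function $G(x, y) = -(x - y)^2$, the sample $f$ and the name `g_transform` are all hypothetical. Integer data is used so the comparison is exact.

```python
def g_transform(f, G, xs, ys):
    """Finite analogue of (12): f#(y) = max over x of (-f(x) + G(x, y))."""
    return {y: max(-f[x] + G(x, y) for x in xs) for y in ys}

xs = range(5)
ys = range(5)
G = lambda x, y: -(x - y) ** 2

f = {x: x * x for x in xs}
k = 3
f_plus_k = {x: f[x] + k for x in xs}

base = g_transform(f, G, xs, ys)
shifted = g_transform(f_plus_k, G, xs, ys)

# Adding k to f subtracts k from f#, so f(x) + f#(y) is unchanged.
constant_rule_holds = all(shifted[y] == base[y] - k for y in ys)
```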

We are interested, for example, in the cases $G = -W$ or $G = -W + I$.

A similar definition and similar properties can be considered for expanding transformations on $[0, 1]$.

Proposition 1. If $V$ is a subaction for $A$, then $V^* = V^\#_W$ is a subaction for $A^*$.

Proof: Given $y$ there exists $z_0$, realizing the maximum of $-V(z) + W(z, y)$, such that

$$V^*(\sigma^*(y)) - V^*(y) = \max_x \{ -V(x) + W(x, \sigma^*(y)) \} - \max_z \{ -V(z) + W(z, y) \}$$

$$= \max_x \{ -V(x) + W(x, \sigma^*(y)) \} - \big( -V(z_0) + W(z_0, y) \big)$$

$$\ge -V(\tau_y(z_0)) + W(\tau_y(z_0), \sigma^*(y)) + V(z_0) - W(z_0, y)$$

$$\ge A(\tau_y(z_0)) - m(A) + W(\tau_y(z_0), \sigma^*(y)) - W(z_0, y)$$

$$= A^*(y) - m(A) = A^*(y) - m(A^*).$$

□

The subaction one gets by the $W$-transform is not necessarily calibrated. Note that if we add a constant to $W$ (the new $W$ will also be a $W$-kernel), then all of the above will also be true.

By a reasoning similar to that of the last proposition one can get:

Proposition 2. If $V^*$ is a sub-action for $A^*$, then

$$(V^*)^\#_W(x) = \max_z \{ -V^*(z) + W(x, z) \}$$

is a subaction for $A$.

Analogous definitions can be considered for an expanding transformation $T : S^1 \to S^1$. This also includes the case of $T(x) = -2x \pmod 1$.



2 The transport problem 

We assume the maximizing probability $\mu_\infty$ for $A$ is unique. We denote by $\mu^*_\infty$ a fixed maximizing probability for $A^*$. We denote by $\mathcal K(\mu_\infty, \mu^*_\infty)$ the set of probabilities $\hat\eta(x, y)$ on $\hat\Sigma$ such that

$$\pi_x^*(\hat\eta) = \mu_\infty, \quad \text{and} \quad \pi_y^*(\hat\eta) = \mu^*_\infty.$$

We are going to consider below the cost function $c(x, y) = I(x) - W(x, y) + \gamma$, which is finite for $x$ such that $I(x) \neq \infty$.

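The Kantorovich Transport Problem: Given $A$ (and all the probabilities described above) we are interested in the minimization problem

$$C(\mu_\infty, \mu^*_\infty) = \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint \big( I(x) - W(x, y) + \gamma \big)\, d\hat\eta = \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint c(x, y)\, d\hat\eta$$

$$= - \max_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint \big( W(x, y) - \gamma - I(x) \big)\, d\hat\eta, \qquad (13)$$

where $I$ is the deviation function for $\mu_\infty = \lim_{\beta \to \infty} \mu_{\beta A}$ (see [BLT]),

$$c_\beta = \log \iint e^{\beta W(x, y)}\, d\nu_{\beta A}(x)\, d\nu_{\beta A^*}(y), \qquad (14)$$

and

$$\gamma = \lim_{\beta \to \infty} \frac{1}{\beta}\, c_\beta, \qquad (15)$$

as in Proposition 5 in [BLT]. Here $c_\beta$ is the normalization constant for $\beta W$ in the sense of (6).

We call $c(x, y) = -W(x, y) + \gamma + I(x)$ the cost function. Since $I$ is lower semi-continuous and $W$ is continuous, $c$ is lower semi-continuous.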

A probability $\hat\eta$ on $\hat\Sigma$ which attains such minimum is called an optimal transport probability. We denote it by $\hat\mu$.

We will show later that $\hat\mu_{max}$, the natural extension of $\mu_\infty$, will be the optimal transport probability $\hat\mu$.

Remark 1: Note that if we subtract the deviation function $I(x)$ from the cost function, that is, if we consider a new cost $c(x, y) = -W(x, y) + \gamma$, the problem above will not change, because $I$ is constant and equal to zero on the support of $\mu_\infty$.

In other words

$$C(\mu_\infty, \mu^*_\infty) = \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint \big( -W(x, y) + \gamma \big)\, d\hat\eta,$$

and the optimal transport probability will be the same. In some sense this setting is nicer because the cost $c$ is a continuous function on $\hat\Sigma$.
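A finite toy analogue may help to see what the minimization asks for. It is not the setting of the paper: here both marginals are uniform on $n$ points, an assumption under which an optimal plan can be taken to be a permutation (Birkhoff-von Neumann theorem), so brute force over permutations is enough for small $n$. The sample points, the kernel $W(x, y) = -2\log(1 + xy)$, the value of `gamma` and the name `optimal_assignment` are hypothetical.

```python
import math
from itertools import permutations

def optimal_assignment(cost):
    """Minimize the average of cost[i][perm[i]] over all permutations.

    Toy analogue of the Kantorovich problem for two uniform marginals on
    n points; only practical for small n.
    """
    n = len(cost)
    best_perm, best_val = None, float("inf")
    for perm in permutations(range(n)):
        val = sum(cost[i][perm[i]] for i in range(n)) / n
        if val < best_val:
            best_perm, best_val = perm, val
    return best_perm, best_val

xs = [0.1, 0.5, 0.9]
ys = [0.2, 0.4, 0.8]
W = lambda x, y: -2.0 * math.log(1.0 + x * y)
gamma = 0.7  # arbitrary constant: it shifts the value, not the minimizer

cost = [[-W(x, y) for y in ys] for x in xs]
cost_shifted = [[-W(x, y) + gamma for y in ys] for x in xs]

plan, val = optimal_assignment(cost)
plan_shifted, val_shifted = optimal_assignment(cost_shifted)
```

In this example the minimizing plan pairs the points in decreasing order, so its support is the graph of a map, and adding the constant `gamma` changes the optimal value but not the plan.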

Definition 8. A pair of functions $f(x)$ and $f^\#(y)$ will be called $c$-admissible (or just admissible for short) if

$$f^\#(y) = \min_{x \in \Sigma} \{ -f(x) + c(x, y) \}. \qquad (16)$$

In other words, $-f^\#$ is the $(-c)$-conjugate of $-f$. Note that in this case, for all $x \in \Sigma$, $y \in \Sigma^*$, we have that

$$f(x) + f^\#(y) \le c(x, y).$$
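The inequality $f(x) + f^\#(y) \le c(x, y)$ follows at once from (16), and on finite sets it can be verified directly. The grids, the cost, the sample $f$ and the name `c_transform` below are hypothetical; integers are used so that the comparisons are exact.

```python
def c_transform(f, c, xs, ys):
    """Finite analogue of (16): f#(y) = min over x of (-f(x) + c(x, y))."""
    return {y: min(-f[x] + c(x, y) for x in xs) for y in ys}

xs = range(4)
ys = range(4)
c = lambda x, y: (x - y) ** 2 + 1
f = {0: 3, 1: -1, 2: 0, 3: 2}

f_sharp = c_transform(f, c, xs, ys)

# Admissible pairs never exceed the cost ...
never_exceeds = all(f[x] + f_sharp[y] <= c(x, y) for x in xs for y in ys)
# ... and for every y the bound is attained at some x (the minimizer in (16)).
attained = all(any(f[x] + f_sharp[y] == c(x, y) for x in xs) for y in ys)
```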

We denote by $\mathcal F$ the set of all admissible pairs $(f(x), f^\#(y))$.

Under quite general hypotheses (the probabilities are perfect) the dual maximizing problem will give the same value (see [Vi1] [Vi2] or Theorem 1.1.1 in

The Kantorovich dual Problem: Given $A$ (and all the probabilities described above) we are interested in the maximization problem

$$D(\mu_\infty, \mu^*_\infty) = \max_{(f, f^\#) \in \mathcal F} \Big( \int f\, d\mu_\infty + \int f^\#\, d\mu^*_\infty \Big). \qquad (17)$$

An admissible pair $(f, f^\#) \in \mathcal F$ which attains the maximum value will be called an optimal pair.
We suppose, from now on, that the maximizing probability for $A$, denoted by $\mu_\infty$, is unique.

We denote, as in [CLT], the calibrated sub-actions $V$ and $V^*$ by

$$V(x) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \varphi_{\beta A}(x) \quad \text{and} \quad V^*(y) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \varphi_{\beta A^*}(y). \qquad (18)$$

The above convergence is uniform and $V$ is (up to a constant) the unique calibrated sub-action for $A$ (see [CLT] [BLT] [GL1]).

We will show later that $(f, f^\#)$ such that $f(x) = -V(x)$ and $f^\#(y) = -V^*(y)$ is the optimal pair.

Important property: If $\hat\mu$ is an optimal transport probability and if $(f, f^\#)$ is an optimal pair in $\mathcal F$, then the support of $\hat\mu$ is contained in the set

$$\{ \langle y, x \rangle \in \hat\Sigma \ \text{such that} \ f(x) + f^\#(y) = c(x, y) \}. \qquad (19)$$

It follows from the primal and dual linear programming problem formulation. The condition above is the complementary slackness condition (see [US] [Ei] [HM]).

The converse of this result is also true (see [Vi2] Remark 5.13 page 59).

If $x$ and $y$ are such that $f(x) + f^\#(y) = c(x, y)$ we say that they are realizers for the cost $c$. In [CLO] it is shown that the set of realizers for $I - W$ is invariant by the dynamics of $\hat\sigma$. In this section we are mainly concerned with the support and not with all realizers.

If one finds $\hat\mu$ and an admissible pair $(f, f^\#)$ satisfying the above claim (for the support), then one solves the Kantorovich problem, that is, one finds the optimal transport probability $\hat\mu$.

Now we will prove Theorem 1.

Proposition 3. The minimizing Kantorovich probability $\hat\mu$ on $\hat\Sigma$ associated to $-W$ is $\hat\mu_{max}$.

Proof: Proposition 10 (1) in [BLT] claims that if $\hat\mu_{max}$ is the natural extension of the maximizing probability $\mu_\infty$, then for all $\langle p^* | p \rangle$ in the support of $\hat\mu_{max}$ we have

$$-V(p) - V^*(p^*) = -W(p, p^*) + \gamma.$$

This is the same as saying that on the support of $\hat\mu_{max}$

$$-V(p) - V^*(p^*) = -W(p, p^*) + \gamma + I(p) = c(p, p^*),$$

because $I$ is zero on the support of $\mu_\infty$.

Then, if $-V(x)$ and $-V^*(y)$ form an admissible pair, $\hat\mu_{max}$ is the optimal transport probability for such $c(x, y)$. This will be shown in the next proposition.

We will show below that the $-c$-transform of $V$ is $V^*$.

□

Note that if $W$ is a $W$-kernel for $A$, then for all $\beta$ we have that $\beta W$ is a $W$-kernel for $\beta A$. We denote by $c_\beta$ the normalizing constant for $\beta W$, as in [BLT]. It is known that $\lim_{\beta \to \infty} \frac{1}{\beta} \log \iint e^{\beta W(x, y)}\, d\nu_{\beta A}(x)\, d\nu_{\beta A^*}(y) = \gamma$.

Now we will show Theorem 2.

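Proposition 4. The pair $(-V, -V^*)$ is admissible.

Proof: For a fixed $y$ we have to show that

$$-V^*(y) = (-V)^\#(y) = \inf_{x \in \Sigma} \{ -(-V(x)) + c(x, y) \}.$$

This is the same as

$$V^*(y) = \sup_{x \in \Sigma} \{ (-V(x)) - c(x, y) \} = \sup_{x \in \Sigma} \{ -V(x) - (\gamma - W(x, y) + I(x)) \},$$

which in particular gives, for all $x$,

$$-V^*(y) \le V(x) + c(x, y). \qquad (20)$$

From Proposition 3 in [BLT] (we just write here $W(x, y)$ instead of $W(y, x)$ there) we have

$$\varphi_{\beta A^*}(y) = \int e^{\beta W_A(x, y) - c_\beta - \log \varphi_{\beta A}(x)}\, d\mu_{\beta A}(x).$$

Consider now the limit

$$V^*(y) = \lim_{\beta \to \infty} \frac{1}{\beta} \log\big(\varphi_{\beta A^*}(y)\big) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \int e^{\beta W_A(x, y) - c_\beta - \log \varphi_{\beta A}(x)}\, d\mu_{\beta A}(x).$$

From [CLT] the function $\frac{1}{\beta} \log(\varphi_{\beta A}(x))$ converges uniformly, as $\beta \to \infty$, to $V(x)$.

Therefore, one can write

$$\lim_{\beta \to \infty} \frac{1}{\beta} \log \int e^{\beta W_A(x, y) - c_\beta - \log \varphi_{\beta A}(x)}\, d\mu_{\beta A}(x) = \lim_{\beta \to \infty} \frac{1}{\beta} \log \int e^{\beta ( W_A(x, y) - \gamma - V(x) )}\, d\mu_{\beta A}(x).$$

Now, by Varadhan's Integral Lemma [DZ] we obtain

$$V^*(y) = \sup_{x \in \Sigma} \{ W_A(x, y) - \gamma - V(x) - I(x) \} = \sup_{x \in \Sigma} \{ -V(x) + W(x, y) - \gamma - I(x) \},$$

where $I$ is the deviation function.

□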

Finally, we get that $\hat\mu_{max}$ is the optimal transport probability for such $c(x, y)$. From now on we will use either the notation $\hat\mu$ or $\hat\mu_{max}$ for the optimal transport probability.

In [LOS] Transport Theory is used as a tool to show that in some cases the calibrated subaction is piecewise analytic. In [CLO] some generic properties of the potential $A$ are considered and special results about the realizers of $W - I$ are obtained.

The last proposition says: for any $y \in \Sigma^*$ we have

$$V^*(y) = \sup_{x \in \Sigma} \{ -V(x) - c(x, y) \}. \qquad (21)$$

Note that when $y = p^*$, for $p^*$ in the support of $\mu^*_\infty$, the supremum

$$V^*(p^*) = \sup_{x} \{ -V(x) + W(x, p^*) - \gamma - I(x) \} = \sup_{x} \{ -V(x) - c(x, p^*) \}$$

is realized at $x = p$, for $p$ in the support of $\mu_\infty$ (with $\langle p^*, p \rangle$ in the support of $\hat\mu$).

Remark 2: Remember that, if the maximizing probability for $A^*$ is unique, then there is a unique calibrated sub-action for $A^*$ (up to an additive constant) [BLT] [CLT].

Analogous definitions and properties can be obtained for $T : S^1 \to S^1$. This also includes the case of $T(x) = -2x \pmod 1$.

We could likewise consider the analogous problem for $A^*$: given $A^*$ (obtained from $A$) fixed, denote by $I^* : \Sigma^* \to \mathbb R$ the non-negative deviation function for $\mu_{\beta A^*} \to \mu^*_\infty$.

Denote $c^*(x, y) = I^*(y) - W(x, y) + \gamma$.

Then consider the problem

$$C(\mu_\infty, \mu^*_\infty) = \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint \big( I^*(y) - W(x, y) + \gamma \big)\, d\hat\eta$$

$$= \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint c^*(x, y)\, d\hat\eta = \inf_{\hat\eta \in \mathcal K(\mu_\infty, \mu^*_\infty)} \iint \big( -W(x, y) + \gamma \big)\, d\hat\eta,$$

which has the same minimizing measures as the minimization for $c(x, y) = I(x) - W(x, y) + \gamma$ among probabilities in $\mathcal K(\mu_\infty, \mu^*_\infty)$. Note also that from Proposition 3 in [BLT] we have

$$\varphi_{\beta A}(x) = \int e^{\beta W_A(x, y) - c_\beta}\, d\nu_{\beta A^*}(y) = \int e^{\beta W_A(x, y) - c_\beta - \log \varphi_{\beta A^*}(y)}\, d\mu_{\beta A^*}(y).$$

In the same way as before one can show that for any $x \in \Sigma$, we have

$$V(x) = \sup_{y \in \Sigma^*} \{ -V^*(y) - c^*(x, y) \}. \qquad (22)$$

Note that $c(x, y) = c^*(x, y)$ on the support of the minimizing $\hat\mu_{max}$ for $c$ (or for $c^*$).

Remark 3: It is not necessarily true that $\big( (-V^*)^\# \big)^\# = -V^*$. However, the expression is true when restricted to the support of the optimal transport probability $\hat\mu_{max}$. In the same way $\big( (-V)^\# \big)^\# = -V$ on the support of $\hat\mu_{max}$.


3 Graph properties and the twist condition 

Consider a lower semi-continuous continuous cost function c{x, y) on S (or, 
a continuous cost function —W{x,y) on S). We refer the reader to |Ra| 
|Vilj |Vi2j and |GM) for general references on transport mass problems. 

Definition 9. A set $S \subset \hat\Sigma$ is called c-cyclically monotone, if for any finite number of points $(x_j,y_j)$ in $S$, $j \in \{1,2,\ldots,n\}$, and any permutation $\sigma$ of the $n$ letters, we have

$$\sum_{j=1}^{n} c(x_j,y_j) \le \sum_{j=1}^{n} c(x_{\sigma(j)},y_j). \qquad (23)$$
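As a minimal sketch of Definition 9, assuming a finite set of pairs and a hypothetical toy cost $c(x,y) = xy$ (not the cost $I(x) - W(x,y) + \gamma$ of the text), inequality (23) can be tested by brute force over all permutations. The names `is_c_cyclically_monotone` and `toy_cost` are illustrative only.

```python
from itertools import permutations


def is_c_cyclically_monotone(points, cost, tol=1e-12):
    """Brute-force test of inequality (23) on a finite list of pairs (x_j, y_j).

    Permutations that fix some indices cover all sub-collections of points,
    so checking the full list is enough for a finite set.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    base = sum(cost(x, y) for x, y in points)
    for perm in permutations(range(len(points))):
        permuted = sum(cost(xs[perm[j]], ys[j]) for j in range(len(points)))
        if base > permuted + tol:
            return False
    return True


def toy_cost(x, y):
    # Hypothetical cost used only for illustration.
    return x * y


# A decreasing configuration and an increasing one in [0, 1] x [0, 1].
decreasing = [(0.1, 0.9), (0.4, 0.5), (0.8, 0.2)]
increasing = [(0.1, 0.2), (0.4, 0.5), (0.8, 0.9)]

print(is_c_cyclically_monotone(decreasing, toy_cost))
print(is_c_cyclically_monotone(increasing, toy_cost))
```

For this toy cost the decreasing configuration passes the test and the increasing one fails it, which is the rearrangement inequality in disguise. The check is factorial in the number of points, so it is only meant for very small examples.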

Proposition 5. (see Theorem 2.3 in [GM]). For a continuous function $c(x,y) \ge 0$, where $(x,y) \in \hat\Sigma$, if $\rho \in \mathcal{K}(\mu_\infty,\mu_\infty^*)$ is optimal for $c$, then $\rho$ has a c-cyclically monotone support.

Corollary 1. The support of $\hat\mu_{max}$, the natural extension of $\mu_\infty$, is c-cyclically monotone.

In the next theorem we will present a direct proof of this fact.

Definition 10. A function $f : \Sigma \to \mathbb{R} \cup \{\infty\}$ is c-concave, if there exists a set $A \subset \Sigma \times \mathbb{R}$ such that

$$f(y) = \sup_{(x,\lambda) \in A} \{c(x,y) + \lambda\}.$$

Definition 11. A function $f : X \to \mathbb{R} \cup \{\infty\}$ is c-convex, if $(-f)$ is c-concave.

Definition 12. Given $x \in \Sigma$, the set $\partial_c f(x)$ is the set of $y \in \Sigma$ such that, for all $z \in \Sigma$, we have

$$f(z) - f(x) \le c(z,y) - c(x,y).$$

In this case we say $y$ is a c-subderivative for $f$ in $x$.

An important problem is to know, for a certain given $x$, if $\partial_c f(x)$ has cardinality 1.

Proposition 6. (see Theorem 2.7 in [GM], Lemma 2.1 in [R] and section 4 in [Ra]). For $S \subset \hat\Sigma$ to be c-cyclically monotone, it is necessary and sufficient that $S \subset \partial_c f = \{(x,y) \mid f(z) - f(x) \le c(z,y) - c(x,y),\ \forall z \in \Sigma\}$, for some c-concave $f$, where $f : \Sigma \to \mathbb{R} \cup \{\infty\}$.

Moreover, $f$ is defined in the following way: choose $(x_0,y_0) \in S$, then

$$f(x) = \inf_{n \in \mathbb{N},\ (x_j,y_j) \in S,\ 1 \le j \le n} \Big[ (c(x,y_n) - c(x_n,y_n)) + (c(x_n,y_{n-1}) - c(x_{n-1},y_{n-1})) + \ldots + (c(x_2,y_1) - c(x_1,y_1)) + (c(x_1,y_0) - c(x_0,y_0)) \Big].$$

We assume, without loss of generality, that $m(A) = 0$.

Note that if $S \subset \hat\Sigma$ is a graph, then for each $x \in \Sigma$ in the x-projection of $S$, we have that $\partial_c f(x)$ has cardinality 1.

Consider fixed $(x_0,y_0)$, $(x_1,y_1)$ in the support of $\hat\mu_{max}$ and $(x_0,y_1)$, $(x_1,y_0) \in \hat\Sigma$. Given a function $f(x,y)$ we denote

$$\Delta_f((x_0,y_1),(x_1,y_0)) = (f(x_0,y_0) + f(x_1,y_1)) - (f(x_0,y_1) + f(x_1,y_0)). \qquad (24)$$

Denote

$$b(x,y) = I(x) + \gamma - W(x,y) + V(x) + V^*(y). \qquad (25)$$

The c-cyclically monotone condition for the support of $\hat\mu_{max}$ will follow from the claim

$$\Delta_c((x_0,y_1),(x_1,y_0)) = (c(x_0,y_0) + c(x_1,y_1)) - (c(x_0,y_1) + c(x_1,y_0)) \le 0. \qquad (26)$$

This is so because any permutation of letters can be obtained as a composition of transformations that exchange just two letters. It will follow from the proof below that $\Delta_c \circ \hat\sigma = \Delta_c$.

Theorem 4. Given $A : \Sigma \to \mathbb{R}$ Hölder, then $c(x,y) = I(x) - W(x,y) + \gamma \ge 0$, for all $(x,y) \in \hat\Sigma$. Moreover, for $(x_0,y_0)$, $(x_1,y_1)$ in the support of $\hat\mu_{max}$, we have $\Delta_c \le 0$. Therefore, the support of $\hat\mu_{max}$ is c-cyclically monotone.

Proof: First we point out that $\Delta_c = \Delta_b$. We will show that under our hypothesis $\Delta_b \le 0$. First note that

$$[V^* \circ \hat\sigma^{-1} - V^* - A^*] \circ \hat\sigma\,(x,y) = [V^* - V^* \circ \hat\sigma - A - W + W \circ \hat\sigma](x,y)$$
$$= [\gamma + V(x) + V^*(y) - W(x,y)] + [V \circ \hat\sigma - V - A](x,y) - [\gamma + V \circ \hat\sigma + V^* \circ \hat\sigma - W \circ \hat\sigma](x,y).$$
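Remember (see [BLT]) that

$$I(x) = \sum_{n=0}^{\infty} [V \circ \hat\sigma - V - A] \circ \hat\sigma^{n}(x,y).$$

We denote

$$I_n(x,y) = \sum_{k=0}^{n-1} [V \circ \hat\sigma - V - A] \circ \hat\sigma^{k}(x,y) = I_n(x),$$

and

$$R_n(x,y) = I_n(x,y) + [\gamma + V(x) + V^*(y) - W(x,y)] - [\gamma + V + V^* - W] \circ \hat\sigma^{n}(x,y).$$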

We claim that if $(x,y)$ is in the support of $\hat\mu_{max}$, then $b(x,y) = 0$. Moreover, for all $(x,y) \in \hat\Sigma$, we have $b(x,y) \ge 0$.

One can prove this result by means of Varadhan's Integral Lemma ([DZ]) with the same reasoning as in the last proposition of the previous section. We will give below a direct proof of the claim.

Either $I(x) = \infty$, and the claim is trivially true, or $I(x)$ is finite. In this case, any accumulation point of $\hat\sigma^{n}(x,y)$ will be in the support of $\hat\mu_{max}$.

Moreover, $b(x,y) = R(x,y) = \lim_{n \to \infty} R_n(x,y) \ge 0$.

As in the support of $\hat\mu_{max}$ we have that $R(x,y) = 0$, then $b(x,y) = 0$.

In any case $R(x,y) \ge 0$. This shows the claim.

We point out that $\Delta_c = \Delta_b = \Delta_{-W}$ in the case $I(x)$ is finite.

We also remark that if $(x_0,y_0)$ is in the support of $\hat\mu_{max}$, then, as $R(x_0,y_0)$ is zero, it follows that $R(x_0,y)$ is finite. This is so because $(x_0,y)$ is in the stable manifold of $(x_0,y_0)$ and

$$R_n(x_0,y) - R_n(x_0,y_0) = \sum_{k=1}^{n} \Big\{ [V^* \circ \hat\sigma^{-1} - V^* - A^*]\,\hat\sigma^{k}(x_0,y) - [V^* \circ \hat\sigma^{-1} - V^* - A^*]\,\hat\sigma^{k}(x_0,y_0) \Big\}.$$

Finally, if $(x_0,y_0)$ and $(x_1,y_1)$ are both in the support of $\hat\mu_{max}$, then

$$R(x_0,y_1) < \infty, \quad R(x_1,y_0) < \infty \quad \text{and} \quad I(x_0) = 0 = I(x_1).$$

In this case, for any $(x,y)$ of the form $(x_0,y_0)$, $(x_1,y_1)$, $(x_1,y_0)$, or $(x_0,y_1)$,

$$R(x,y) = I(x) + [\gamma + V + V^* - W](x,y) = b(x,y).$$

As we know that $R$ is non-negative, then

$$[b(x_0,y_0) + b(x_1,y_1)] - [b(x_1,y_0) + b(x_0,y_1)] = 0 - [b(x_1,y_0) + b(x_0,y_1)] \le 0.$$

This shows that $\Delta_b \le 0$.

□ 

We did not use the twist condition above. 

Note that we could alternatively consider the function $g : \Sigma \to \mathbb{R}$ defined in the following way: choose $(x_0,y_0) \in S$, then

$$g(x) = \inf_{n \in \mathbb{N},\ (x_j,y_j) \in S,\ 1 \le j \le n} \Big[ (W(x,y_n) - W(x_n,y_n)) + (W(x_n,y_{n-1}) - W(x_{n-1},y_{n-1})) + \ldots + (W(x_2,y_1) - W(x_1,y_1)) + (W(x_1,y_0) - W(x_0,y_0)) \Big],$$

which has the advantage of just taking into account a continuous function $W$.

The graph property for $S$ = support of $\hat\mu$, and all kinds of different considerations, can be obtained from such $g$.

We want to show now that if $W$ satisfies the twist condition and the maximizing probability for $A$ is unique, then the support of $\hat\mu$ on $\hat\Sigma$ is a graph. Our proof works for the Bernoulli space $\{0,1,2,\ldots,d\}^{\mathbb{N}}$ as well as for the interval $[0,1]$ (considering $T$ either conjugated to $2x$ (mod 1) or to $-2x$ (mod 1)).

Consider the cost $c(x,y) = I(x) - W(x,y) - \gamma$, and a c-cyclically monotone subset $S \subset X \times Y$.

Lemma 1. Suppose $c$ satisfies the twist condition and let $S$ be a c-cyclically monotone subset. If $(a,b), (a',b') \in S$ and $a \ne a'$ and $b \ne b'$, then $a < a'$ and $b > b'$, or $a > a'$ and $b < b'$.

Proof. Indeed, suppose $a < a'$. Then, if $b < b'$, the twist condition on $W$ implies that

$$c(a,b) + c(a',b') > c(a,b') + c(a',b).$$

On the other hand, $S$ is a c-cyclically monotone subset, so

$$c(a,b) + c(a',b') \le c(a,b') + c(a',b),$$

which is absurd. □
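The dichotomy of Lemma 1 can be illustrated on a grid. The sketch below assumes a hypothetical twist cost $c(x,y) = \tfrac{4}{3}xy - \tfrac{1}{3}(x^2+y^2)$ (positive mixed derivative; the terms $I$ and $\gamma$ are dropped because they cancel in the two-point inequality) and uses exact rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction as F


def twist_cost(x, y):
    # Hypothetical cost with positive mixed derivative: for a < a', b < b'
    # one gets c(a,b) + c(a',b') > c(a,b') + c(a',b).
    return F(4, 3) * x * y - (x * x + y * y) / 3


def pair_is_monotone(p, q, cost):
    """Two-point case of c-cyclical monotonicity for the pairs p and q."""
    (a, b), (a2, b2) = p, q
    return cost(a, b) + cost(a2, b2) <= cost(a, b2) + cost(a2, b)


def in_forbidden_zone(p, q):
    """True when q lies strictly up-right or strictly down-left of p."""
    (a, b), (a2, b2) = p, q
    return (a2 - a) * (b2 - b) > 0


grid = [F(i, 4) for i in range(5)]
agreements = all(
    pair_is_monotone((a, b), (a2, b2), twist_cost)
    == (not in_forbidden_zone((a, b), (a2, b2)))
    for a in grid for b in grid for a2 in grid for b2 in grid
    if a != a2 and b != b2
)
print(agreements)
```

On this grid, a pair of points with distinct coordinates is compatible with c-cyclical monotonicity exactly when the second point avoids the two quadrants described in the lemma.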



A similar property is true for W. 

This Lemma means that the correct figure associated to a pair of points in $S$ is given by:

[Figure: Characterization of $S$, showing the two forbidden zones associated to $(a,b)$.]

We point out that, in principle, there could exist points $z$ of $S$ in the vertical fiber passing through $a$ or in the horizontal fiber passing through $b$. Now we will show Theorem 3.

Theorem 5. (Graph Theorem) Suppose the $W$-kernel satisfies the twist condition and let $\hat\mu$ be the c-minimizing probability measure for the transport problem. Then $S = \text{supp}\,\hat\mu$ is a graph in $x$ (up to an orbit of measure zero); moreover, this graph is monotone non-increasing.

Proof. In order to take advantage of the geometrical and combinatorial arguments we will present pictures for the case of a transformation $T : [0,1] \to [0,1]$, given by $T(x) = 2x$ (mod 1).

Define $v^+(x) = \max\{y \mid (x,y) \in S\}$ and $v^-(x) = \min\{y \mid (x,y) \in S\}$. In order to prove that supp $\hat\mu$ is a graph we need to prove that $v^-(x) = v^+(x)$ for any $x$ in the support of $\mu_\infty$.

We say a point $(x,y)$ in the support of $\hat\mu$ is non-graph if there exists another point of the form $(x,z)$ in the support of $\hat\mu$ such that $z \ne y$.

Note that the images of two points in the support of $\hat\mu$ on the fiber over $x$ are two different points in the support of $\hat\mu$ on the fiber over $\sigma(x)$. That is, the forward image by $\hat\sigma$ of non-graph points consists of non-graph points. This may fail for backward images by $\hat\sigma$.

Suppose the support of the (unique) maximizing probability $\mu_\infty$ is a periodic orbit. If $S$ is not a graph, then $v^-(x) < v^+(x)$ for some $x$. As the transformation $\hat\sigma$ contracts each fiber by forward iteration, the image of the interval fiber from $(x,v^-(x))$ to $(x,v^+(x))$, by a finite iterate of $\hat\sigma$, goes inside the fiber from $(x,v^-(x))$ to $(x,v^+(x))$. Therefore, $\sigma^*$ has a periodic point in the support of $\mu_\infty^*$. If the maximizing probability $\mu_\infty$ is unique for $A$, then $\mu_\infty^*$ is unique for the maximization problem for $A^*$. In this case the support of $\mu_\infty^*$ is this periodic orbit. Therefore, there is a minimal distance (in the vertical fiber) between non-graph points, and this is in contradiction with the contraction on vertical fibers. The conclusion is that $S$ is a graph if the support of the maximizing probability $\mu_\infty$ is a periodic orbit.

Remark 4: In the case of the shift, if supp $\mu_\infty$ is a periodic orbit, one can easily show that if

$$\text{supp}\,\mu_\infty = \text{the orbit by } \sigma \text{ of } (a_0, a_1, \ldots, a_{n-1}, a_0, \ldots),$$

then

$$\text{supp}\,\mu_\infty^* = \text{the orbit by } \sigma^* \text{ of } (a_{n-1}, \ldots, a_2, a_1, a_0, a_{n-1}, \ldots).$$

[Figure: Support of $\hat\mu$ in the periodic case.]

We suppose from now on that the support of the maximizing probability $\mu_\infty$ is not a periodic orbit.

[Figure: Characterization of $S$, showing the forbidden zone.]

Suppose that $v^-(x) < v^+(x)$ for some $x$. Then we claim that there is no other point in the support of $\hat\mu$ in the fiber over $x$ between $p_1 = v^-(x)$ and $p_2 = v^+(x)$. Indeed, from the above picture we see that if there exists a point $(x,p)$ in the support of $\hat\mu$ such that $v^-(x) = p_1 < p < p_2 = v^+(x)$, then, as $\hat\mu$ is ergodic, there should exist a point $(q_1,q_2)$ in a small neighborhood $V$ of $(x,p)$ which returns to $V$ by a forward n-iterate of $\hat\sigma$.

This iterate has to return to the fiber, and this contradicts the fact that the support of the maximizing probability $\mu_\infty$ is not a periodic orbit.

If the support of $\mu_\infty$ is not a periodic orbit, then we claim that there do not exist two pairs $(x_1,y_1)$, $(x_1,z_1)$ and $(x_2,y_2)$, $(x_2,z_2)$ in the support of $\hat\mu$ such that the orbits by $\sigma$ of $x_1$ and $x_2$ are different.

In order to simplify the argument and notation we consider below $T^*(x) = 2x$ (mod 1), but we point out that the reasoning applies to any expanding transformation of degree $d$. Given $y_n$ and $z_n$, $n = 1,2$, there exists a rational point of the form $s_n = q/2^k$ with $0 < q < 2^k$, $q, k \in \mathbb{N}$, such that $y_n < s_n < z_n$, $n = 1,2$. Consider the $s_n$ determined by the smallest possible value $k$.
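The choice of the dyadic point with smallest $k$ can be made concrete. The sketch below is an illustration under the assumption $0 \le y < z \le 1$; the helper name `simplest_dyadic_between` is hypothetical. It returns the pair $(q,k)$ with $y < q/2^k < z$ and $k$ minimal, using exact rationals.

```python
from fractions import Fraction
import math


def simplest_dyadic_between(y, z):
    """Return (q, k) with y < q / 2**k < z, 0 < q < 2**k and k as small as possible."""
    y, z = Fraction(y), Fraction(z)
    if not (0 <= y < z <= 1):
        raise ValueError("expected 0 <= y < z <= 1")
    k = 1
    while True:
        scale = 2 ** k
        q = math.floor(y * scale) + 1  # smallest integer q with q / 2**k > y
        if q < scale and Fraction(q, scale) < z:
            return q, k
        k += 1


q, k = simplest_dyadic_between(Fraction(3, 5), Fraction(7, 10))
s = Fraction(q, 2 ** k)
# For the minimal k the numerator q is odd, so 2**(k-1) * s is 1/2 modulo 1.
image = (2 ** (k - 1) * s) % 1
print(q, k, s, image)
```

Since $q$ is odd when $k$ is minimal, $k-1$ iterates of $x \mapsto 2x$ (mod 1) send $s$ to $1/2$; this is the arithmetic fact behind the statement that the horizontal fiber through $1/2$ first separates the two points at time $k-1$.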



The pair of points $\hat T^{-r}(x_n,y_n)$ and $\hat T^{-r}(x_n,z_n)$, $r \ge 0$, determine non-graph points in the same fiber until time $r = k$. At time $r = k-1$, it happens for the first time that the horizontal fiber through $1/2$ cuts the vertical segment connecting $\hat T^{-(k-1)}(x_n,y_n)$ and $\hat T^{-(k-1)}(x_n,z_n)$.

In this way, for each $n$, we get a horizontal forbidden region $A_n$ (a horizontal strip from one vertical side to the other vertical side of $[0,1] \times [0,1]$) determined by such a pair $\hat T^{-(k-1)}(x_n,y_n)$ and $\hat T^{-(k-1)}(x_n,z_n)$, $n = 1,2$, which contains the horizontal fiber through $1/2$.

If we apply the argument for $n = 1$, then the next forbidden region $A_2$ for $n = 2$ will contain the previous one $A_1$. Moreover, considering the full forbidden region determined by these two pairs of points we reach a contradiction.

In the picture below we show the final pair of points $q_1$ and $q_2$ in a $\hat\sigma$-orbit (in the same vertical fiber) which has the property that its images $p_1$ and $p_2$ are on different sides of the upper and lower rectangles. The images of $p_1$ and $p_2$ by $\hat\sigma$ are no longer in the same vertical fiber (and neither are their future iterates). There is no room for a different pair $p_1$ and $p_2$ like this (because of the forbidden region).

In this way, from the above, we get that there can exist just one orbit of $x$ by $\sigma$ such that over the fiber over $x$ there are two points in the support. That is, the projection $K \subset \Sigma$ on the x-axis of the non-graph points has to be the orbit of a single point $x$. Therefore, $\mu_\infty(K) = \sum_k \mu_\infty(\{\sigma^k(x)\})$.



We assume first that the set of non-graph points has probability 1 and we will reach a contradiction. Indeed, $\mu_\infty(\{\sigma^k(x)\}) \ge \mu_\infty(\{\sigma^j(x)\})$ for $k \ge j$, and the $\mu_\infty$ probability of the set $\{x\}$ is either zero or positive.

Remember that the support of $\hat\mu$ is invariant by $\hat\sigma$. Now we will show that, indeed, if there exist non-graph points, this set has probability 1.

Note that if the vertical fiber over $x \in \Sigma$ is such that $v^-(x) < v^+(x)$, then $\sigma(x)$ also has this property. If the transformation $\hat\sigma$ we consider preserves orientation in the vertical fiber, then the iterates are in the same order. Otherwise they exchange order. That is, the set of non-graph points is invariant by forward iteration by $\hat\sigma$. Moreover, $\hat\sigma$ is a forward contraction in vertical fibers. Denote by $B$ the set of points $(x,v^+(x))$ in the support of $\hat\mu$ such that $v^-(x) < v^+(x)$. The set $B$ is the upper part of the non-graph part of the set $S$.

We will show that $\hat\mu(B) = 0$ or $\hat\mu(B) = 1$.

We suppose first that $\hat\sigma$ preserves order in the fiber by forward iteration.

Consider the set $\tilde B$ of points $(x,y)$ in the support of $\hat\mu$ such that for some $n \ge 0$ we have $\hat\sigma^n(x,y) \in B$. Note that, as $B$ is forward invariant, once $\hat\sigma^n(x,y) \in B$ for some fixed $n$, then $\hat\sigma^m(x,y) \in B$ for any $m > n$.

We will show that $\hat\sigma^{-1}\tilde B = \tilde B$. The fact that $\hat\sigma^{-1}\tilde B \subset \tilde B$ follows easily from the definition of $\tilde B$.

Given $(x,y) \in \tilde B$, there exists $n \ge 0$ such that $\hat\sigma^n(x,y) \in B$. If $n \ge 1$, then $\hat\sigma^{n-1}(\hat\sigma(x,y)) \in B$ and, therefore, $(x,y) \in \hat\sigma^{-1}\tilde B$. In the other case $(x,y) \in B$, but then $\hat\sigma(x,y) \in B$, because $\hat\sigma$ preserves order in the fiber and there are no more than two points in the vertical fiber over $\sigma(x)$ which are in $S$. Therefore, $(x,y) \in \hat\sigma^{-1}\tilde B$.

As $\hat\mu$ is ergodic, $\hat\mu(\tilde B) = 0$ or $\hat\mu(\tilde B) = 1$.

If $\hat\mu(\tilde B) = 1$, then take a Birkhoff point $z \in \tilde B$ for the ergodic probability $\hat\mu$. We get that the asymptotic frequency of visits to the set $C$ of points $(x,v^-(x))$ in the support of $\hat\mu$ such that $v^-(x) < v^+(x)$ (the lower part of the non-graph part of the set $S$) is zero. Finally, we get that $\hat\mu(C) = 0$. In the same way $\hat\mu(B) = 1$.

If $\hat\mu(\tilde B) = 0$, we get that $\hat\mu(B) = 0$. Now, using a similar argument for the lower part of the non-graph part, we get that $\hat\mu(C) = 1$.

This shows that the $\pi_1$ projection of the non-graph points has probability one, and this proves the theorem.

□ 

The above reasoning also applies to T{x) = —2x (mod 1) and to the 
shift in the Bernoulli space. 



4 Selection of minimizing sequences 

In this section we want to exhibit an explicit expression for the function $f$ (defined before) such that the set $\{(x,\partial_c f(x)) \mid x \in \text{support of } \mu_\infty\}$ equals the support of $\hat\mu_{max}$, in the case the support of $\hat\mu_{max}$ is a periodic orbit. At the end of the section we briefly address the general case.

Definition 13. We say that an upper semicontinuous $c : \hat\Sigma = \Sigma \times \Sigma \to \mathbb{R}$ satisfies the twist condition on $\hat\Sigma$ if (below we just consider values of $c$ which are finite) for any $(a,b) \in \hat\Sigma$ and $(a',b') \in \hat\Sigma$, with $a' > a$, $b' > b$, we have

$$c(a,b) + c(a',b') > c(a,b') + c(a',b). \qquad (27)$$

If $W$ is twist and $c(x,y) = I(x) - W(x,y) + \gamma$, then $c$ is twist. We assume this property from now on.

Theorem 6. Suppose the support of $\hat\mu_{max}$ is a periodic orbit. Choose $(x_0,y_0)$ in such a way that $x_0 \in \Sigma$ is the smallest point in the projection and $y_0 \in \Sigma$ is the smallest on the fiber over $x_0$. From the above, in this case for any given $z \in \Sigma$, the $f$ defined before is such that

$$f(z) = (c(z,y_n) - c(x_n,y_n)) + (c(x_n,y_{n-1}) - c(x_{n-1},y_{n-1})) + \ldots + (c(x_3,y_2) - c(x_2,y_2)) + (c(x_2,y_1) - c(x_1,y_1)) + (c(x_1,y_0) - c(x_0,y_0)),$$

where we use all the possible $x_i$ which are in the support of the maximizing probability for $A$ on the left of $z$, and for each $x_i$ we choose the corresponding $y_i$. Moreover, $x_0 < x_1 < x_2 < \ldots < x_n$.

In the notation of $f$ above, if $x_{k-1} < z \le x_k$ for consecutive elements $x_{k-1}$, $x_k$ of the support of $\mu_\infty$, then the last pair $(x_n,y_n) = (x_n(z),y_n(z))$ is $(x_{k-1},y_{k-1})$, which means $n = k-1$. The case $z = x_k$ is included in the expression above for $f$; in this case $x_k = x_{n+1}$ following the above notation. The index of the $x_i$ has no dynamical meaning.

Proof:

Consider the cost $c(x,y) = I(x) - W(x,y) - \gamma$, and a c-cyclically monotone subset $S \subset X \times Y$. Also, assume that $c$ satisfies the twist condition: if $a < a'$ and $b < b'$ then

$$c(a,b) + c(a',b') > c(a,b') + c(a',b).$$

In this way, the definition of $c$ implies that

$$W(a,b) + W(a',b') < W(a,b') + W(a',b).$$

Define $\Delta(x,x',y) = W(x,y) - W(x',y)$, so the twist condition can be restated as: if $a < a'$ and $b < b'$, then

$$\Delta(a,a',b) < \Delta(a,a',b').$$

Therefore, if $a < a'$, the map $y \to \Delta(a,a',y)$ is increasing. Observe that:

i) $\Delta(x,x',y) = -\Delta(x',x,y)$

ii) $\Delta(x,x,y) = 0$

iii) $\Delta(x,x',y) + \Delta(x',x'',y) = \Delta(x,x'',y)$

In particular the map $y \to \Delta(a',a,y)$ is decreasing if $a' > a$.

Given a c-convex function $f : X \to \mathbb{R}$ we define the c-subderivative of $f$ in $x \in X$ as being the set

$$\partial_c f(x) = \{y \in Y \mid f(z) - f(x) \le c(z,y) - c(x,y),\ \forall z \in X\}.$$

Using $c(x,y) = I(x) - W(x,y) - \gamma$ we get

$$\partial_c f(x) = \{y \in Y \mid f(z) - f(x) \le I(z) - I(x) - [W(z,y) - W(x,y)],\ \forall z \in X\}.$$

We know that $S$ is c-cyclically monotone if and only if $S \subset \partial_c f$, where $f$ is a c-convex function given by

$$f(z) = \min_{(x_i,y_i) \in S,\ i = 1,\ldots,n}\ \sum_{i=0}^{n} c(x_{i+1},y_i) - c(x_i,y_i),$$

where $(x_0,y_0) \in S$ is a fixed point and $x_{n+1} = z$. Using $c(x,y) = I(x) - W(x,y) - \gamma$ we get

$$f(z) = \min_{(x_i,y_i) \in S,\ i = 1,\ldots,n}\ \sum_{i=0}^{n} I(x_{i+1}) - I(x_i) - [W(x_{i+1},y_i) - W(x_i,y_i)] = \min_{(x_i,y_i) \in S,\ i = 1,\ldots,n}\ I(z) - I(x_0) + \sum_{i=0}^{n} \Delta(x_i,x_{i+1},y_i).$$

Lemma 2. If $(x_i,y_i) \in S$, $i = 0,1,2$, is such that $x_0 < x_1 < x_2 < z$ and $y_2 < y_1 < y_0$, then

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1) > \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_2).$$

Proof. Observe that $\Delta(x_1,z,y_1) = \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_1) > \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_2)$, because $\Delta(x_2,z,\cdot)$ is increasing and $y_1 > y_2$.

□ 

Lemma 3. If $(x_i,y_i) \in S$, $i = 0,1,2$, is such that $x_0 < x_1 < z < x_2$ and $y_2 < y_1 < y_0$, then

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1) < \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_2).$$

[Figure 1 - bad]

[Figure 2 - good]

In particular,

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1) < \Delta(x_0,x_2,y_0) + \Delta(x_2,z,y_2).$$

Proof. Observe that $\Delta(x_1,z,y_1) = \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_1) < \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_2)$, because $\Delta(x_2,z,\cdot)$ is decreasing (as $z < x_2$) and $y_1 > y_2$.

[Figure 4 - good]

Now observe that

$$\Delta(x_0,x_2,y_0) + \Delta(x_2,z,y_2) = \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_0) + \Delta(x_2,z,y_2) > \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,z,y_2) > \Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1).$$

□



Now one can generalize the idea above. Suppose that $(x_i,y_i) \in S$, $i = 0,1,2,\ldots,n$, is such that $x_0 < x_1 < \ldots < x_k < z < x_{k+1} < \ldots < x_n$ and $y_n < \ldots < y_2 < y_1 < y_0$. Then

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \ldots + \Delta(x_k,z,y_k) < \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \ldots + \Delta(x_n,z,y_n).$$

In order to see this, we proceed by induction on the right side of the inequality above:

$$\Delta(x_{n-1},x_n,y_{n-1}) + \Delta(x_n,z,y_n) > \Delta(x_{n-1},x_n,y_{n-1}) + \Delta(x_n,z,y_{n-1}) = \Delta(x_{n-1},z,y_{n-1}).$$

In this step we discard the pair $(x_n,y_n)$. We must repeat this process while $n - j > k$, discarding all points on the right side of $z$.

So the conclusion is that we can discard all points on the right side of $z$, decreasing the sum, and we can introduce a point between the last point on the left side of $z$ and $z$, decreasing the sum (see Figures 3 and 4).

[Figure 5]

We discard $(x_2,y_2)$, $(x_3,y_3)$, $(x_4,y_4)$ from the right side and insert $(A,B)$ between the last point on the left side of $z$ and $z$.

[Figure 6 - the good one]

The case in which $z < x_0$ must be analyzed now:

[Figure 7 - bad: projected points in the order $x_5$, $x_4$, $x_3$, $z$, $x_2$, $x_1$, $x_0$]

[Figure 8 - good (better): projected points in the order $z$, $x_2$, $x_1$, $x_0$]
Observe that:

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2) + \Delta(x_3,x_4,y_3) + \Delta(x_4,x_5,y_4) + \Delta(x_5,z,y_5)$$
$$> \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2) + \Delta(x_3,x_4,y_3) + [\Delta(x_4,x_5,y_4) + \Delta(x_5,z,y_4)]$$
$$= \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2) + \Delta(x_3,x_4,y_3) + \Delta(x_4,z,y_4),$$

and we proceed successively to eliminate 4 and 3.

Now we check what happens with permutations of the order of the projected points.

Note that the sum

$$\sum_{i=0}^{n} c(x_{i+1},y_i) - c(x_i,y_i)$$

can change when we reorder the sequence of points $(x_i,y_i) \in S$, $i = 1,\ldots,n$. So we need to consider the natural question about the best way to rename these points.

Please check the figure below:

[Figure 9 - too bad: projected points labeled $x_0$, $x_4$, $x_1$, $x_3$, $x_2$, $x_5$]

We claim that, in order to minimize the sum above, it is possible to discard all the points on the right side of $z$ and also all the points between $x_0$ and $z$ that are not ordered.

In fact:

$$\Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2) + \Delta(x_3,x_4,y_3) + [\Delta(x_4,x_5,y_4) + \Delta(x_5,z,y_5)]$$
$$> \Delta(x_0,x_1,y_0) + \Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2) + [\Delta(x_3,x_4,y_3) + \Delta(x_4,z,y_4)]$$
$$> \Delta(x_0,x_1,y_0) + [\Delta(x_1,x_2,y_1) + \Delta(x_2,x_3,y_2)] + \Delta(x_3,z,y_3)$$
$$> \Delta(x_0,x_1,y_0) + [\Delta(x_1,x_3,y_1) + \Delta(x_3,z,y_3)]$$
$$> \Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1).$$

So the sequence $(x_0,y_0)$, $(x_1,y_1)$, in this order, minimizes this sum.

We know that the graph property is true. But suppose we have a more general case where $\Delta(x,z,y)$ can be considered and we do not have the graph property.

Consider the sequence $(x_0,y_0)$, $(x_1,y_1)$ and suppose $z > x_1 > x_0$. Additionally suppose that $(x_1,\cdot) \cap S \ne \{y_1\}$, so we can compare the sum $\Delta(x_0,x_1,y_0) + \Delta(x_1,z,y_1)$ with $\Delta(x_0,x_1,y_0) + \Delta(x_1,z,y)$ for any $y \in (x_1,\cdot) \cap S$.

We claim that this function is monotone increasing in $y$.

[Figure 10 - going down is better]

In fact suppose that $y' < y_1 < y'' < y_0$, as in Fig. 8. Observe that $\Delta(x_1,z,y_1) < \Delta(x_1,z,y'')$ and $\Delta(x_1,z,y_1) > \Delta(x_1,z,y')$ because $x_1 < z$.

The conclusion is that if the support of $\hat\mu_{max}$ is a periodic orbit, then we choose $(x_0,y_0)$ in the support of $\hat\mu_{max}$.

From the above, in this case given $z \in \Sigma$, then

$$f(z) = (c(z,y_n) - c(x_n,y_n)) + (c(x_n,y_{n-1}) - c(x_{n-1},y_{n-1})) + \ldots + (c(x_3,y_2) - c(x_2,y_2)) + (c(x_2,y_1) - c(x_1,y_1)) + (c(x_1,y_0) - c(x_0,y_0)),$$

where we use all the possible $x_i$, $i = 1,2,\ldots,n$, on the left of $z$, and for each $x_i$ we choose the corresponding $y_i$ such that $(x_i,y_i)$ is in the support of $\hat\mu_{max}$. Moreover, $x_0 < x_1 < x_2 < \ldots < x_n$.

Finally, we can say that $\partial_c f(x_k) = y_k$, for any $k$.

One can get similar results for the function $g$ (obtained just from the kernel $W$) defined before.

From the reasoning above (for the case of $W$ satisfying the twist condition), in the case the support of $\mu_\infty$ is not a periodic orbit, the infimum in the definition of $f$ is not attained in a finite sequence of $x_n$ in the support of $\mu_\infty$.
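The selection rule for the periodic case can be checked on a toy example. The sketch below assumes a hypothetical twist cost $c(x,y) = \tfrac{4}{3}xy - \tfrac{1}{3}(x^2+y^2)$ (the terms $I$ and $\gamma$ are omitted) and a hypothetical finite decreasing set standing in for the support; it compares the ordered-chain expression with a brute-force minimum over all chains of distinct points starting at $(x_0,y_0)$. All names are illustrative.

```python
from fractions import Fraction as F
from itertools import permutations


def cost(x, y):
    # Hypothetical twist cost (positive mixed derivative); I and gamma are omitted.
    return F(4, 3) * x * y - (x * x + y * y) / 3


def chain_sum(chain, z):
    """Sum of c(x_{i+1}, y_i) - c(x_i, y_i) along the chain, with x_{n+1} = z."""
    next_xs = [x for x, _ in chain[1:]] + [z]
    return sum(cost(x_next, y) - cost(x, y) for (x, y), x_next in zip(chain, next_xs))


def f_ordered(support, z):
    """Start at (x_0, y_0), then use every point strictly left of z in increasing x."""
    pts = sorted(support)
    chain = [pts[0]] + [p for p in pts[1:] if p[0] < z]
    return chain_sum(chain, z)


def f_brute_force(support, z):
    """Minimum over all chains of distinct points that start at (x_0, y_0)."""
    pts = sorted(support)
    start, rest = pts[0], pts[1:]
    return min(
        chain_sum([start, *mid], z)
        for n in range(len(rest) + 1)
        for mid in permutations(rest, n)
    )


# Hypothetical decreasing "support": x increases while y decreases.
support = [(F(1, 10), F(9, 10)), (F(3, 10), F(7, 10)),
           (F(6, 10), F(4, 10)), (F(9, 10), F(1, 10))]

for z in (F(1, 20), F(1, 5), F(3, 10), F(7, 10), F(1)):
    print(z, f_ordered(support, z), f_brute_force(support, z))
```

In this example the two values coincide for every tested $z$: points to the right of $z$ are never used, and the points to the left of $z$ are visited in increasing order. The brute-force search is restricted to chains without repetitions, which is an assumption of the sketch rather than part of the definition of $f$.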



5 Appendix 

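Here we consider first the shift $\Sigma = \{0,1\}^{\mathbb{N}}$, and $\Sigma$ as a metric space with the usual distance:

$$d(x,y) = \begin{cases} 0 & \text{if } x = y, \\ (1/2)^{n} & \text{if } n = \min\{i \mid x_i \ne y_i\}. \end{cases}$$

Additionally, we suppose that $\Sigma$ is ordered by $x < y$ if $x_i = y_i$ for $i = 1,\ldots,n-1$, and $x_n = 0$ and $y_n = 1$.

As usual, we consider the dynamical system $(\Sigma,\sigma)$ where $\sigma : \Sigma \to \Sigma$ is given by $\sigma(x) = \sigma(x_1,x_2,x_3,\ldots) = (x_2,x_3,x_4,\ldots)$.

a) Potentials and the involution kernel

As usual we denote

$$\tau^*_x(y) = (x_1,y_1,y_2,y_3,\ldots) \quad \text{and} \quad \tau_y(x) = (y_1,x_1,x_2,x_3,\ldots)$$

and

$$\hat\sigma(x,y) = (\sigma(x),\tau^*_x(y)) \quad \text{and} \quad \hat\sigma^{-1}(x,y) = (\tau_y x,\sigma^*(y))$$

the skew product map, where $\sigma^*(y) = \sigma^*(y_1,y_2,y_3,\ldots) = (y_2,y_3,y_4,\ldots)$.

Given a continuous function $A : \Sigma \to \mathbb{R}$, remember that a continuous function $W : \Sigma \times \Sigma \to \mathbb{R}$ is an involution kernel for $A$ if $(W \circ \hat\sigma^{-1} - W + A \circ \hat\sigma^{-1})(x,y)$ does not depend on $x$. In this case the continuous function $A^*(y) = (W \circ \hat\sigma^{-1} - W + A \circ \hat\sigma^{-1})(x,y)$ is called the $W$-dual potential of $A$.

As in [BLT] we define the cocycle $\Delta_A(x,x',y)$, where

$$\Delta_A(x,x',y) = \sum_{n \ge 1} A \circ \hat\sigma^{-n}(x,y) - A \circ \hat\sigma^{-n}(x',y) = \sum_{n \ge 1} A \circ \tau_{y,n}(x) - A \circ \tau_{y,n}(x'),$$

and its dual version $\Delta_{A^*}(x,y,y')$, where

$$\Delta_{A^*}(x,y,y') = \sum_{n \ge 1} A^* \circ \hat\sigma^{n}(x,y) - A^* \circ \hat\sigma^{n}(x,y') = \sum_{n \ge 1} A^* \circ \tau^*_{x,n}(y) - A^* \circ \tau^*_{x,n}(y').$$

Here $\tau_{y,n}(x)$ denotes the first coordinate of $\hat\sigma^{-n}(x,y)$ and $\tau^*_{x,n}(y)$ the second coordinate of $\hat\sigma^{n}(x,y)$.

Note that:

i) $\Delta_A(x,x',y) = -\Delta_A(x',x,y)$, in particular $\Delta_A(x,x,y) = 0$,

ii) $\Delta_A(x,x',y) + \Delta_A(x',x'',y) = \Delta_A(x,x'',y)$,

iii) $\Delta_A(x,x',y) = \Delta_A(\tau_y x,\tau_y x',\sigma^*(y)) + [A \circ \tau_y x - A \circ \tau_y x']$,

and the same relations are true for $\Delta_{A^*}(x,y,y')$.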
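The maps $\sigma$, $\tau_y$, $\hat\sigma$ and $\hat\sigma^{-1}$ can be sketched on finite words. This is only an illustration: finite tuples of 0s and 1s stand in for infinite sequences (an assumption of the sketch), and the function names are hypothetical.

```python
def sigma(x):
    """Shift: drop the first symbol."""
    return x[1:]


def tau(y, x):
    """tau_y(x) = (y_1, x_1, x_2, ...): prepend the first symbol of y to x."""
    return (y[0],) + x


def sigma_hat(x, y):
    """Skew product (x, y) -> (sigma(x), tau*_x(y))."""
    return sigma(x), tau(x, y)


def sigma_hat_inv(x, y):
    """Inverse skew product (x, y) -> (tau_y(x), sigma*(y))."""
    return tau(y, x), sigma(y)


def dist(x, y):
    """d(x, y) = (1/2)^n, with n the first (1-based) index where x and y differ."""
    for n, (a, b) in enumerate(zip(x, y), start=1):
        if a != b:
            return 0.5 ** n
    return 0.0


x = (0, 1, 1, 0)
y = (1, 0, 0, 1)
forward = sigma_hat(x, y)
print(forward)
print(sigma_hat_inv(*forward))
print(dist((0, 1, 1), (0, 1, 0)))
# Python compares tuples lexicographically, which matches the order on the shift space.
print((0, 1, 0) < (0, 1, 1))
```

On truncated words the two maps are still mutually inverse: one symbol moves from the front of $x$ to the front of $y$ and back.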

Using these properties one can prove that, for any involution kernel, we have $W(x,y) - W(x',y) = \Delta_A(x,x',y)$ and $W(x,y) - W(x,y') = \Delta_{A^*}(x,y,y')$.

From this fact, we get that the difference between two involution kernels for $A$ is a continuous function of $y$:

$$\{\text{Involution kernels for } A\}/C^0(\Sigma) = W^1,$$

where $W^1(x,y) = \Delta_A(x,x',y)$, for a fixed $x' \in \Sigma$, is called a fundamental involution kernel of $A$. Indeed, property (iii) shows that $W^1$ is an involution kernel for $A$.

On the other hand, given another involution kernel $W$ we have $W(x,y) - W(x',y) = \Delta_A(x,x',y)$, thus

$$W(x,y) = W(x',y) + \Delta_A(x,x',y) = W(x',y) + W^1(x,y) = g(y) + W^1(x,y),$$

where $g(y) = W(x',y) \in C^0(\Sigma)$.

As an example we compute the general dual potential. First, for $W^1(x,y) = \Delta_A(x,x',y)$ we get:

$$A_1^*(y) = W^1(\tau_y x,\sigma^*(y)) - W^1(x,y) + A(\tau_y x)$$
$$= \Delta_A(\tau_y x,x',\sigma^*(y)) - \Delta_A(x,x',y) + A(\tau_y x)$$
$$= A(\tau_y x') + \Delta_A(\tau_y x',x',\sigma^*(y)).$$

Given another involution kernel $W$, we have $W(x,y) = W(x',y) + W^1(x,y)$, thus

$$A^*(y) = (W \circ \hat\sigma^{-1} - W + A \circ \hat\sigma^{-1})(x,y) = W(x',\sigma^*(y)) - W(x',y) + A_1^*(y).$$

b) The twist property of an involution kernel 

If $A : \Sigma \to \mathbb{R}$ is a potential and $W$ an arbitrary involution kernel for $A$, as we said before, $W$ has the twist property if for any $a, b, a', b' \in \Sigma$:

$$W(a,b) + W(a',b') < W(a,b') + W(a',b),$$

provided that $a < a'$ and $b < b'$. If we rewrite this inequality as

$$W(a,b) + W(a',b') < W(a,b') + W(a',b)$$
$$W(a,b) - W(a',b) < W(a,b') - W(a',b')$$
$$\Delta_A(a,a',b) < \Delta_A(a,a',b'),$$

we get an alternative criterion for the twist property: $W$ has the twist property if, for any $a, a' \in \Sigma$ with $a < a'$, the function $y \to \Delta_A(a,a',y)$ is strictly increasing.

Remark 5: This characterization shows a very important fact. The twist property is a property of $A$, so we can say that $A$ is a twist potential or, equivalently, that $A$ has a twist involution kernel (as, obviously, any other involution kernel is then also twist).

Remark 6: As an initial approximation we can consider a different setting of dynamics. Let $T(x) = -2x$ mod 1, and

$$\tau_0 x = -\tfrac{1}{2}x + \tfrac{1}{2}, \quad \text{and} \quad \tau_1 x = -\tfrac{1}{2}x + 1,$$

the inverse branches that define the skew maps (that are not the actual natural extension of $T$):

$$\hat T(x,y) = (T(x),\tau^*_x(y)) \quad \text{and} \quad \hat T^{-1}(x,y) = (\tau_y x,T^*(y)).$$

So, one can compute an involutive (that is, $A^*(y) = A(y)$) smooth kernel for $A_1(x) = x$ and $A_2(x) = x^2$, given by

$$W_1(x,y) = -\tfrac{1}{3}(x+y) \quad \text{and} \quad W_2(x,y) = \tfrac{1}{3}(x^2+y^2) - \tfrac{4}{3}xy.$$

As a corollary we get that any potential $A(x) = a + bx + cx^2$ has a smooth involution kernel given by $W(x,y) = a + bW_1(x,y) + cW_2(x,y)$. Here and in the next paragraphs, we will denote

$$W_A(x,y) := a + bW_1(x,y) + cW_2(x,y),$$

where $A(x) = a + bx + cx^2$ is a polynomial of degree 2.

We observe that the twist property can be derived from the sign of the second mixed derivative of the involution kernel when it is smooth: a strictly negative mixed derivative gives the twist inequality. Note that

$$\frac{\partial^2 W_1}{\partial x \partial y} = 0, \qquad \frac{\partial^2 W_2}{\partial x \partial y} = -\frac{4}{3},$$

thus $W_1$ is not twist and $W_2$ is. Actually any potential $A(x) = a + bx + cx^2$ where $c > 0$ is twist.
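The kernels for $x$ and $x^2$ can be checked with exact arithmetic. The sketch below takes $W_1(x,y) = -\tfrac13(x+y)$ and $W_2(x,y) = \tfrac13(x^2+y^2) - \tfrac43 xy$, and assumes that $\tau_y$ is $\tau_0$ when $y < 1/2$ and $\tau_1$ otherwise, paired with the corresponding branch of $-2y$ mod 1; this branch pairing, and all function names, are assumptions of the sketch.

```python
from fractions import Fraction as F


def tau(y, x):
    """Inverse branch of T(x) = -2x mod 1 selected by y (tau_0 if y < 1/2, else tau_1)."""
    return -x / 2 + (F(1, 2) if y < F(1, 2) else F(1))


def T_star(y):
    """Branch of -2y mod 1 paired with tau_y (assumed pairing)."""
    return -2 * y + (1 if y < F(1, 2) else 2)


def W1(x, y):
    return -(x + y) / 3


def W2(x, y):
    return (x * x + y * y) / 3 - F(4, 3) * x * y


def dual(A, W, x, y):
    """A*(y) = W(tau_y x, T*(y)) - W(x, y) + A(tau_y x), evaluated at a chosen x."""
    tx = tau(y, x)
    return W(tx, T_star(y)) - W(x, y) + A(tx)


def cross_difference(W, a, b, a2, b2):
    """W(a,b) + W(a',b') - W(a,b') - W(a',b); negative for a < a', b < b' means twist."""
    return W(a, b) + W(a2, b2) - W(a, b2) - W(a2, b)


grid = [F(i, 8) for i in range(9)]
linear_ok = all(dual(lambda t: t, W1, x, y) == y for x in grid for y in grid)
quadratic_ok = all(dual(lambda t: t * t, W2, x, y) == y * y for x in grid for y in grid)
print(linear_ok, quadratic_ok)
print(cross_difference(W1, F(1, 4), F(1, 4), F(3, 4), F(3, 4)))
print(cross_difference(W2, F(1, 4), F(1, 4), F(3, 4), F(3, 4)))
```

On the grid, the value of `dual` does not depend on $x$ and equals $A(y)$, which is the involutive property; the cross difference vanishes for $W_1$ and is strictly negative for $W_2$, in agreement with the mixed derivatives $0$ and $-4/3$.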

Remark 7: In this remark we are going to consider the case of $A(x) = a + bx + cx^2$ where $c < 0$ (not twist). In this case we will be able to compute the calibrated subaction explicitly, which, we believe, is interesting in itself.

As a first example consider $A(x) = -(x-1)^2$, which is a concave potential. From [JS] [J6] we get that the unique maximizing measure for this potential is $\mu_\infty = \delta_{2/3}$, so the critical value is $m = A(2/3)$. Using the fact that $m = A(2/3)$ one can show that there is a unique (up to constants) calibrated subaction $\phi$ given by

$$\phi(x) = W(x,2/3) - W(2/3,2/3) = -\tfrac{1}{3}x^2 + \tfrac{2}{9}x,$$

where the kernel is given by

$$W(x,y) = -\tfrac{1}{3}x^2 - \tfrac{1}{3}y^2 + \tfrac{4}{3}xy - \tfrac{2}{3}x - \tfrac{2}{3}y.$$

As a second example consider $A(x) = -(x-\tfrac{1}{2})^2$, which is also a concave potential.

The general arguments in [JS] show that any maximizing measure for this potential is of the form $\mu_\infty = (1-t)\delta_{1/3} + t\delta_{2/3}$, where $t \in [0,1]$, so the critical value is $m = A(1/3) = A(2/3)$. In this case the involutive smooth involution kernel is:

$$W(x,y) = -\tfrac{1}{3}x^2 - \tfrac{1}{3}y^2 + \tfrac{4}{3}xy - \tfrac{1}{3}x - \tfrac{1}{3}y.$$

It is easy to verify that

$$\phi(x) = (W(x,1/3) - W(1/3,1/3))\,\chi_{[0,1/2)}(x) + (W(x,2/3) - W(2/3,2/3))\,\chi_{[1/2,1]}(x)$$
$$= \max\{W(x,1/3) - W(1/3,1/3),\ W(x,2/3) - W(2/3,2/3)\}$$

is a calibrated subaction for $A$.
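The calibration equation $\phi(x) = \max_i\{\phi(\tau_i x) + A(\tau_i x) - m\}$ can be tested with exact rationals. The sketch assumes the kernel $W(x,y) = -\tfrac13x^2 - \tfrac13y^2 + \tfrac43xy - \tfrac13x - \tfrac13y$ (the linear part is taken from $W_A = a + bW_1 + cW_2$ up to an additive constant), $A(x) = -(x-\tfrac12)^2$ and $m = A(1/3)$; the function names are illustrative.

```python
from fractions import Fraction as F


def A(x):
    return -((x - F(1, 2)) ** 2)


def W(x, y):
    # Assumed kernel for A(x) = -(x - 1/2)^2, up to an additive constant.
    return -(x * x) / 3 - (y * y) / 3 + F(4, 3) * x * y - x / 3 - y / 3


def phi(x):
    """Candidate subaction: the maximum of the two normalized kernel slices."""
    v1 = W(x, F(1, 3)) - W(F(1, 3), F(1, 3))
    v2 = W(x, F(2, 3)) - W(F(2, 3), F(2, 3))
    return max(v1, v2)


def tau0(x):
    return -x / 2 + F(1, 2)


def tau1(x):
    return -x / 2 + 1


m = A(F(1, 3))  # critical value, equal to A(2/3)


def calibration_rhs(x):
    """max over the two preimages t of x of phi(t) + A(t) - m."""
    return max(phi(t) + A(t) - m for t in (tau0(x), tau1(x)))


grid = [F(i, 12) for i in range(13)]
calibrated = all(phi(x) == calibration_rhs(x) for x in grid)
print(m, calibrated)
```

The check is exact on the chosen grid. It is a numerical confirmation on finitely many points, not a substitute for the proof.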

In order to prove this, define

$$V_1(x) = W(x,1/3) - W(1/3,1/3) = \Delta(x,1/3,1/3) = -\tfrac{1}{3}x^2 + \tfrac{1}{9}x,$$
$$V_2(x) = W(x,2/3) - W(2/3,2/3) = \Delta(x,2/3,2/3) = -\tfrac{1}{3}x^2 + \tfrac{5}{9}x - \tfrac{2}{9},$$

and

$$\phi(x) = V_1(x)\,\chi_{[0,1/2)}(x) + V_2(x)\,\chi_{[1/2,1]}(x) = \max\{V_1(x),V_2(x)\}.$$

Note that

$$\phi(\tau_0 x) = V_1(\tau_0 x)\,\chi_{[0,1/2)}(\tau_0 x) + V_2(\tau_0 x)\,\chi_{[1/2,1]}(\tau_0 x)$$
$$= V_1(\tau_0 x) = \Delta(\tau_0 x,1/3,1/3)$$
$$= \Delta(\tau_{1/3}x,\tau_{1/3}1/3,T^*(1/3))$$
$$= \Delta(x,1/3,1/3) - [A(\tau_{1/3}x) - A(\tau_{1/3}1/3)]$$
$$= V_1(x) - [A(\tau_0 x) - m].$$

Thus $\phi(\tau_0 x) + A(\tau_0 x) - m = V_1(x)$. Analogously, $\phi(\tau_1 x) + A(\tau_1 x) - m = V_2(x)$, so

$$\phi(x) = \max\{V_1(x),V_2(x)\}$$
$$= \max\{\phi(\tau_0 x) + A(\tau_0 x) - m,\ \phi(\tau_1 x) + A(\tau_1 x) - m\}$$
$$= \max_{y}\{\phi(\tau_y x) + A(\tau_y x) - m\}.$$

c) Twist criteria

It is natural to consider a criterion for the twist property for a class of functions that have a small dependence on the cubic (or higher order) terms. Let $P_2 = \{p(x) = a + bx + cx^2 \mid c > 0\}$ be the set of strictly convex polynomials of degree 2. Consider $p \in P_2$, and define

CM = {Ae C^([0, l])\Aix) = p{x) + eR{x), —R e C\[0, 1])} 

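Theorem 7. For any $p \in P_2$, there exists $\varepsilon > 0$ such that every $A \in C_\varepsilon(p)$ is twist.

Proof. Consider $p \in P_2$ fixed. So $p$ has a smooth and involutive involution kernel given by

$$W_p(x,y) = p(W)(x,y) = (a + bW_1 + cW_2)(x,y),$$

that is, $p^*(y) = p(y)$, where $W_1(x,y) = -\tfrac{1}{3}(x+y)$ and $W_2(x,y) = \tfrac{1}{3}(x^2+y^2) - \tfrac{4}{3}xy$ are the involution kernels associated to $x$ and $x^2$ respectively. Let $A = p + \varepsilon R \in C_\varepsilon(p)$, and let $W_R$ be the involution kernel for $R$. By the regularity assumed on $R$, the kernel $W_R$ is differentiable in the variable $x$.

Using the linearity of the cohomological equation, we get $W_A(x,y) = p(W)(x,y) + \varepsilon W_R(x,y)$, and differentiating with respect to $x$, we have

$$\frac{\partial}{\partial x}W_A(x,y) = \frac{\partial}{\partial x}p(W)(x,y) + \varepsilon\frac{\partial}{\partial x}W_R(x,y) = -\tfrac{1}{3}b + \tfrac{2}{3}cx - \tfrac{4}{3}cy + \varepsilon\frac{\partial}{\partial x}W_R(x,y).$$

Since $-\tfrac{4}{3}c < 0$ and $\frac{\partial}{\partial x}W_R(x,y) \in C^0([0,1]^2)$, the compactness of $[0,1]^2$ implies that $\frac{\partial}{\partial x}W_A(x,\cdot)$ is a strictly decreasing function for any $\varepsilon$ small enough, which is equivalent to the twist property. □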

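Remark 9: If $A \in C^\infty([0,1])$ is strongly convex, we can consider a perturbation of $A$ of order 2 given by

$$B_\varepsilon(x) = A(0) - A'(0)x + \frac{A''(0)}{2}x^2 + \varepsilon \sum_{n \ge 3} \frac{A^{(n)}(0)}{n!}x^n \in C_\varepsilon(p_A),$$

where $p_A = A(0) - A'(0)x + \frac{A''(0)}{2}x^2 \in P_2$. Thus, we can find $\varepsilon_0 > 0$ such that $B_\varepsilon$ is twist for any $0 < \varepsilon < \varepsilon_0$.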
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